Location: San Francisco, United States   |   Full-Time   |   $190,000 - $209,000
Python, SQL, Spark, AWS, ETL, Data Warehousing, Kafka, Data Engineer

About Alembic

Alembic is pioneering a revolution in marketing by proving the true ROI of marketing activities. The Alembic Marketing Intelligence Platform applies sophisticated algorithms and AI models to solve this long-standing measurement problem. When you join the Alembic team, you’ll help build the tools that provide unprecedented visibility into how marketing drives revenue, helping a growing list of Fortune 500 companies make more confident, data-driven decisions.

About the Role

As a Senior Data Engineer at Alembic, you will be at the core of our data platform, building scalable and reliable data pipelines, optimizing storage solutions, and enabling real-time and batch analytics. You will work closely with data scientists, software engineers, and product leaders to design and implement robust data architectures.

Key Responsibilities

  • Design, develop, and maintain scalable ETL pipelines that ingest, process, and transform large volumes of structured and unstructured data.
  • Optimize data storage solutions using modern data lakehouse architectures and best practices for cost, performance, and reliability.
  • Collaborate with data scientists and engineers to integrate machine learning models and analytical workloads into production environments.
  • Ensure data integrity, quality, and security by implementing monitoring, alerting, and governance best practices.
  • Work with cloud-based data warehouses and distributed data processing frameworks.
  • Continuously evaluate and implement new technologies to improve data infrastructure and operational efficiency.

What We’re Looking For

  • 10+ years of experience in data engineering, software engineering, or a related field.
  • Strong expertise in SQL and Python for data processing.
  • Experience with modern data warehousing and lakehouse solutions (e.g., Iceberg or similar).
  • Proficiency in working with distributed systems and big data technologies (Apache Spark, Hadoop, Kafka, Flink).
  • Hands-on experience with cloud platforms (AWS, GCP, Azure) and related data services.
  • Deep understanding of data modeling, database design, and performance optimization.
  • Familiarity with CI/CD pipelines, containerization (Docker, Kubernetes), and infrastructure-as-code (Terraform, CloudFormation) for data pipelines.
  • Strong problem-solving skills, with a passion for building reliable, scalable, and maintainable data systems.
  • Excellent communication skills and the ability to collaborate in a cross-functional team.

Nice to Have

  • Experience with graph databases, NoSQL, or time-series databases.
  • Familiarity with data privacy, governance, and compliance (GDPR, HIPAA, SOC 2).
  • Experience with machine learning pipelines and MLOps.

Why Join Alembic?

  • High-impact role: Shape the future of our data platform at an early-stage startup.
  • Growth opportunities: Work in a fast-paced environment with opportunities to take on new challenges.
  • Collaborative culture: Join a team of passionate, skilled engineers and technologists.
  • Competitive compensation: Including salary, equity, and benefits.
Post Date: July 29, 2025