Staff MLE/Infra Engineer
Location: Boston or NYC (Hybrid) | Full-Time
MLE
Machine Learning Engineer
MLOps
Infrastructure
Staff Engineer
Python
GCP
Kubernetes
Docker
LLM
AI
Healthcare
Boston
NYC
Hybrid
Vertex AI
AI Engineer
Data Engineer
Company: Layer Health is building an AI layer for healthcare, founded by MIT/Harvard researchers, focusing on synthesizing medical records using LLMs to reduce friction and improve patient care. We've raised a $21M Series A.

Role: Staff+ MLE/Infra Engineer

Location: Hybrid in Boston (Back Bay) or NYC (Grand Central). No remote option available.

Join our ~20 person team as a senior engineer focused on the infrastructure supporting our machine learning models, particularly Large Language Models (LLMs). You will design, build, and manage the systems for training, evaluating, deploying, and monitoring ML models at scale on GCP, enabling our research and product teams to iterate quickly and reliably.

Responsibilities:
* Design, build, and maintain scalable infrastructure for ML model training, deployment, and inference (MLOps).
* Optimize ML workflows for performance, cost, and reliability on GCP.
* Develop tools and automation for model versioning, experiment tracking, and monitoring.
* Collaborate with Research Scientists and MLEs to productionize models, including LLMs.
* Work with backend and data platform teams to ensure seamless integration of ML models.
* Stay current with the latest advancements in MLOps, LLM infrastructure, and cloud technologies.
* Ensure security and compliance of ML systems handling sensitive medical data.

Technical Skills:
* Strong software engineering background, proficient in Python.
* Proven experience building and managing ML infrastructure (MLOps).
* Deep knowledge of cloud platforms, preferably GCP (AI Platform, Kubernetes Engine, Vertex AI).
* Experience with containerization (Docker, Kubernetes) and orchestration.
* Familiarity with ML frameworks (e.g., PyTorch, TensorFlow) and LLM ecosystem tools.
* Experience with data processing pipelines and tools.
* Understanding of infrastructure-as-code (Terraform) and CI/CD practices.

Ideal Candidate:
* Minimum of 4 years of professional experience in software engineering, with a focus on ML infrastructure or MLOps (significantly more experience is likely preferred at the Staff level).
* Experience deploying and scaling machine learning models, especially LLMs, in production.
* Strong problem-solving skills applied to complex infrastructure and ML challenges.
* Excellent communication and collaboration skills.
* Passion for applying AI/ML to solve real-world healthcare problems.

Post Date: May 21, 2025