Teton Job - AI Inference Engineer

About Teton: Teton specializes in deploying cutting-edge AI technologies in healthcare settings, creating privacy-first solutions that improve patient monitoring and care delivery.

Role Overview: As an AI Inference Engineer at Teton, you’ll focus on optimizing and deploying machine learning models for real-time healthcare applications. You’ll ensure our AI systems perform reliably in production environments.

Key Responsibilities:

Optimize ML models for low-latency inference
Develop and maintain model serving infrastructure
Collaborate with researchers to operationalize models
Implement monitoring and versioning for deployed models
Ensure model performance meets clinical requirements
Stay current with inference optimization techniques

Required Skills:

Experience with model optimization techniques
Knowledge of hardware acceleration (GPU, TPU)
Proficiency with model serving frameworks
Understanding of quantization and pruning techniques
Strong Python programming skills

Ideal Candidate:

3+ years of experience in ML inference
Background in computer architecture
Experience with model compression techniques
Familiarity with healthcare AI applications
Ability to balance performance with resource constraints
Passion for deploying impactful AI solutions