Location: Copenhagen, Denmark - Onsite   |   Full-Time
Inference Optimization | Model Deployment | Quantization | AI | AI Engineer | Back End Engineer

About Teton: Teton specializes in deploying cutting-edge AI technologies in healthcare settings, creating privacy-first solutions that improve patient monitoring and care delivery.

Role Overview: As an AI Inference Engineer at Teton, you’ll focus on optimizing and deploying machine learning models for real-time healthcare applications. You’ll ensure our AI systems perform reliably in production environments.

Key Responsibilities:

  • Optimize ML models for low-latency inference
  • Develop and maintain model serving infrastructure
  • Collaborate with researchers to operationalize models
  • Implement monitoring and versioning for deployed models
  • Ensure model performance meets clinical requirements
  • Stay current with inference optimization techniques

Required Skills:

  • Experience with model optimization techniques
  • Knowledge of hardware acceleration (GPU, TPU)
  • Proficiency with model serving frameworks
  • Understanding of quantization and pruning techniques
  • Strong Python programming skills

Ideal Candidate:

  • 3+ years of experience in ML inference
  • Background in computer architecture preferred
  • Experience with model compression techniques
  • Familiarity with healthcare AI applications
  • Ability to balance performance with resource constraints
  • Passion for deploying impactful AI solutions

Post Date: June 27, 2025