Location: Santa Clara, CA   |   Full-Time   |   $180,000 - $220,000
VLM, Vision Language Model, Computer Vision, CV, ML, Machine Learning, PyTorch, HuggingFace, ViT, Vision Transformer, Quantization, Distillation, CUDA, Triton, vLLM, Python, GCP, AWS, Docker, Conda, Ray, TDD, MLOps, AI, Founding Engineer, AI Engineer, Staff Engineer
Join VLM Run, the Unified Gateway for Visual AI, as a Founding CV / ML Engineer. VLM Run is building an end-to-end platform for developers to fine-tune, specialize, and operationalize Vision Language Models (VLMs). We are seeking exceptional engineers to advance our core VLM capabilities (vision-language understanding, model architecture innovation, OCR, function-calling enhancements, fine-tuning acceleration, robustness improvements) and to build and optimize our full VLM stack (model compilation, acceleration, distillation, cost-efficient serving and scaling). You should bring creativity, problem-solving skills, and a strong work ethic.

Required Expertise (BS & 4+ YoE, or MS & 2+ YoE):
*   Training: Vision Transformers (ViTs), PyTorch, HuggingFace (trl, transformers, peft), advanced quantization/distillation techniques, latest open-source VLM architectures (Llama, Qwen, Phi, etc.).
*   Serving: CUDA optimizations, torch.compile, OpenAI Triton kernel authoring, serving infra (vLLM, Ollama, native HF), speculative/guided decoding, etc.
*   SW / DevOps: Python, GCP/AWS, GitHub-based software development cycle, Docker, Conda, Ray, test-driven development (TDD).

Bonus: GitHub repo with 1K+ stars, published impactful ML/CV research or software, SaaS/AI application building experience (landing pages, auth, billing, logging, telemetry, infra). No advanced degree is required if you're a hacker who stands out through past ML systems work or research.

The company was founded by veteran AI experts (20+ YoE from Meta, Tesla Autopilot, Cruise, Toyota Research, MIT) and is backed by leading VCs. We offer competitive compensation and benefits, including healthcare, dental, and 401(k).
Post Date: April 15, 2025