Location: San Francisco Bay Area   |   Full-Time   |   $190,000 - $250,000
Tags: AI, ML, Inference, LLM, Python, C++, TensorRT-LLM, Kubernetes, PyTorch, TensorFlow, ONNX, CUDA, GPU, Distributed Systems, Real-time, Observability, AI Engineer
Company: Perplexity AI
Location: San Francisco Bay Area (SF or Palo Alto)

About Perplexity AI:
At Perplexity, we've experienced tremendous growth and adoption since publicly launching the world's first fully functional conversational answer engine in 2022. We've grown from answering 2.5 million questions per day at the start of 2024 to around 20 million daily queries in December 2024. We also offer Perplexity Enterprise Pro, which counts leading companies like NVIDIA, the Cleveland Cavaliers, Bridgewater, and Zoom as customers. To support our rapid expansion, we've raised significant funding from respected technology investors like IVP, NEA, Jeff Bezos, NVIDIA, Databricks, Bessemer Venture Partners, and many visionary individuals.

Role Overview:
You will work on large-scale deployment of machine learning models for real-time inference, developing and optimizing the systems that power Perplexity's AI capabilities.

Current Stack: Python, C++, TensorRT-LLM, Kubernetes

Responsibilities:
- Develop APIs for AI inference that will be used by both internal and external customers
- Benchmark and address bottlenecks throughout our inference stack
- Improve the reliability and observability of our systems and respond to system outages
- Explore novel research and implement LLM inference optimizations

Qualifications:
- Experience with ML systems and deep learning frameworks (e.g. PyTorch, TensorFlow, ONNX)
- Familiarity with common LLM architectures and inference optimization techniques (e.g. continuous batching, quantization)
- Understanding of GPU architectures or experience with GPU kernel programming using CUDA
- Experience deploying reliable, distributed, real-time model serving at scale (a plus)

Compensation & Benefits:
- Cash compensation range: $190,000 - $250,000 (final offer amounts are determined by multiple factors, including experience and expertise).
- Equity may be part of the total compensation package.
- Comprehensive health, dental, and vision insurance for you and your dependents, plus a 401(k) plan.
Post Date: May 20, 2025