AI Inference Engineer (SF)
Location: San Francisco Bay Area | Full-Time | $190,000 - $250,000
AI
ML
Inference
LLM
Python
C++
TensorRT-LLM
Kubernetes
PyTorch
TensorFlow
ONNX
CUDA
GPU
Distributed Systems
Real-time
Observability
AI Engineer
Company: Perplexity AI
Location: San Francisco Bay Area (SF or Palo Alto)

About Perplexity AI:
At Perplexity, we've experienced tremendous growth and adoption since publicly launching the world's first fully functional conversational answer engine in 2022. We've grown from answering 2.5 million questions per day at the start of 2024 to around 20 million daily queries in December 2024. We also offer Perplexity Enterprise Pro, which counts leading companies like Nvidia, the Cleveland Cavaliers, Bridgewater, and Zoom as customers. To support our rapid expansion, we've raised significant funding from respected technology investors like IVP, NEA, Jeff Bezos, NVIDIA, Databricks, Bessemer Venture Partners, and many visionary individuals.

Role Overview:
You will work on large-scale deployment of machine learning models for real-time inference, developing and optimizing the systems that power Perplexity's AI capabilities.

Current Stack: Python, C++, TensorRT-LLM, Kubernetes

Responsibilities:
- Develop APIs for AI inference that will be used by both internal and external customers
- Benchmark and address bottlenecks throughout our inference stack
- Improve the reliability and observability of our systems and respond to system outages
- Explore novel research and implement LLM inference optimizations

Qualifications:
- Experience with ML systems and deep learning frameworks (e.g. PyTorch, TensorFlow, ONNX)
- Familiarity with common LLM architectures and inference optimization techniques (e.g. continuous batching, quantization, etc.)
- Understanding of GPU architectures or experience with GPU kernel programming using CUDA
- Experience with deploying reliable, distributed, real-time model serving at scale is a plus.

Compensation & Benefits:
- Cash compensation range: $190,000 - $250,000 (Final offer amounts determined by multiple factors including experience and expertise).
- Equity may be part of the total compensation package.
- Comprehensive health, dental, and vision insurance for you and your dependents. Includes a 401(k) plan.
Post Date: May 20, 2025