Staff Infrastructure Engineer
Location: Remote, U.S. | Full-Time | $120,000 - $250,000
Infrastructure
SRE
Site Reliability Engineer
Golang
Kubernetes
distributed systems
automation
configuration management
Linux networking
CNI
latency sensitive workloads
monitoring
performance
reliability
oncall
incident management
WebRTC
Real-time communications
Back End Engineer
Staff Engineer
Company:
LiveKit builds open-source APIs to power the future of computing. We are a company of engineers building software stacks for other engineers, with a passion for building something truly impactful. We are a remote company with a global presence, and we reason from first principles. LiveKit is on a mission to help developers create and scale real-time experiences.

Role:
We are hiring a Site Reliability Engineer to help manage and scale the core components of the LiveKit infrastructure. Visibility, performance, and reliability of our globally distributed architecture are critical and a top priority.

Responsibilities:
- Build and own the foundational infrastructure that our products run on.
- Work directly in our products' Golang code base to implement SRE-related objectives.
- Take a data-driven approach to quantifying system performance and reliability, and use it to drive project priorities.
- Participate in the on-call rotation, including leading incident management for complex situations.
- Work on automation and advanced configuration management so that our team can manage large numbers of clusters, distributed across the world and running various products.
- Work with infrastructure vendors when their solutions aren't meeting our real-time performance and reliability needs.

Technical Skills:
- Experience managing complex multi-region distributed systems running on top of container orchestration systems like Kubernetes.
- Experience with Linux networking, overlay networks, and Kubernetes CNIs.
- Low-level knowledge for troubleshooting and tuning latency-sensitive workloads.
- Golang proficiency.

Ideal Candidate:
- A balance of strengths in both software engineering and large-scale system administration.
- Passionate about maintainability and keeping system complexity at bay, but able to balance this with meeting launch deadlines.
- Bonus: incident management training and experience as an Incident Commander.
Post Date: April 24, 2025