Senior Site Reliability Engineer
Location: Los Angeles, CA (Marina del Rey) | Full-Time | $150,000 - $170,000
Site Reliability Engineering
Python
Node.js
Docker
Kubernetes
GCP
AWS
Observability
Infrastructure
Back End Engineer
Staff Engineer
Cyber Security
**About Zefr:**

Zefr is a leading global technology company specializing in responsible marketing within walled garden social environments. Our patented AI technology empowers brands and agencies to manage content adjacency on platforms like YouTube, Meta, TikTok, and Snap with precision and transparency. We combine artificial intelligence with deep platform integrations to deliver measurable results while maintaining ethical standards in social advertising.

**About The Role:**

As a Senior Site Reliability Engineer at Zefr, you will play a critical role in ensuring the stability, scalability, and reliability of our infrastructure. This position requires a blend of technical expertise, leadership skills, and a passion for continuous improvement. You will be responsible for designing, implementing, and maintaining systems that support our AI-powered marketing solutions, with a focus on optimizing performance for large-scale operations. The ideal candidate will thrive in a fast-paced environment, balancing short-term priorities with long-term architectural improvements while mentoring junior engineers and driving innovation in our infrastructure.

**Key Responsibilities:**

* Lead the design and implementation of highly available, scalable systems supporting Zefr's AI-driven marketing platforms
* Optimize infrastructure performance to handle increasing traffic and data loads from major social platforms
* Develop and maintain monitoring systems to ensure service reliability and rapid issue resolution
* Collaborate with engineering teams to implement "secure by default" architecture principles
* Design and deploy fault-tolerant systems with minimal downtime and maximum availability
* Mentor junior engineers and share expertise in site reliability engineering best practices
* Analyze system performance metrics and drive continuous improvement initiatives
* Work closely with data scientists and platform teams to optimize ML infrastructure

**Required Skills:**

* Proven experience in designing and implementing scalable systems for high-traffic environments
* Deep understanding of distributed systems architecture and networking principles
* Proficiency with containerization technologies like Docker and Kubernetes
* Experience with cloud platforms (AWS, GCP, or Azure) and their respective services
* Strong knowledge of monitoring, logging, and observability tools (Prometheus, Grafana, Datadog)
* Expertise in automation and infrastructure-as-code practices
* Familiarity with GitOps and CI/CD pipelines
* Solid grasp of system security principles and best practices
* Ability to mentor and guide engineering teams

**Technical Stack:**

* Cloud Platforms: GCP, AWS
* Containerization: Docker, Kubernetes (GKE, EKS)
* GitOps: ArgoCD, GitHub Actions
* Languages: Python, Node.js
* Observability: Prometheus, Grafana, OpenTelemetry, Datadog
* Infrastructure: Terraform, Bash scripting
* Networking: TCP/IP, DNS, Load Balancers

**What We Offer:**

* Competitive salary ranging from $150,000 to $170,000
* Opportunities to work with cutting-edge AI and cloud technologies
* Collaborative environment with cross-functional teams
* Regular team events and professional development opportunities
* Flexible work arrangements with required on-site presence 2-3 times per week
* Comprehensive benefits package including health insurance, retirement plans, and paid time off
* Chance to make a significant impact on the infrastructure supporting global marketing solutions

**Join Zefr:**

If you are passionate about building reliable systems that power innovative marketing solutions, we want to hear from you. As a Senior Site Reliability Engineer, you will have the opportunity to shape Zefr's technical direction while contributing to a diverse and dynamic team. This role offers both technical challenge and professional growth in a company that values innovation and excellence. Apply today and become part of our mission to transform digital marketing through technology.

Post Date: June 20, 2025