Location: NY, SF, or Remote   |   Full-Time

**Company:** Roboflow provides the end-to-end platform for computer vision, serving over 1M developers and processing millions of images. We're backed by GV ($40M Series B) and scaling rapidly.

**Role:** As an Infrastructure Engineer, you will design, build, scale, and maintain the cloud infrastructure that powers the Roboflow platform. You'll keep our systems reliable, scalable, secure, and cost-effective, so our team can ship features quickly and our users can train and deploy models effectively.

**Responsibilities:**
*   Design, implement, and manage Roboflow's cloud infrastructure (primarily AWS/GCP/Azure).
*   Build and maintain CI/CD pipelines for automated testing and deployment.
*   Manage container orchestration systems (e.g., Kubernetes).
*   Implement and manage monitoring, logging, and alerting systems.
*   Ensure infrastructure security and compliance (e.g., SOC 2).
*   Optimize infrastructure for performance and cost-efficiency.
*   Automate infrastructure provisioning and management (Infrastructure as Code, e.g., Terraform, Pulumi).
*   Support engineering teams with infrastructure needs and best practices.

**Ideal Candidate:**
*   Proven experience as an Infrastructure Engineer, DevOps Engineer, or SRE.
*   Deep understanding of cloud platforms (AWS, GCP, or Azure).
*   Strong experience with containerization (Docker) and orchestration (Kubernetes).
*   Experience with Infrastructure as Code tools (Terraform, Pulumi, CloudFormation).
*   Proficiency in scripting languages (e.g., Python, Bash).
*   Experience with CI/CD tools (e.g., GitHub Actions, GitLab CI, Jenkins).
*   Experience with monitoring and logging tools (e.g., Prometheus, Grafana, Datadog, ELK stack).
*   Strong understanding of networking, security, and database concepts.
*   Experience managing infrastructure for ML/AI workloads is a plus.

Post Date: April 21, 2025