Senior Site Reliability Engineer, Infrastructure

Fleetio (Headquarters: Remote (USA, Canada, Mexico))

Location: Remote (USA, Canada, Mexico)   |   Full-Time
Tags: SRE, Site Reliability Engineer, Infrastructure, DevOps, AWS, Kubernetes, Ruby, Ruby-on-Rails, Terraform, Datadog, APM, Monitoring, Observability, Performance, Scalability, Cloud, IaC, Kafka, Redis, Elasticsearch, TimescaleDB, Back End Engineer

**Company Overview:**
Fleetio is a modern SaaS platform that helps thousands of organizations manage fleet operations. Founded in 2012; Series D ($450M, March 2025); 6k+ customers; $60M+ ARR; 325+ employees. Remote-first since 2012. Proud founding member of the Rails Foundation. Benefits: competitive pay, equity, and annual bonus; 401(k) with match; health, vision, and dental; 4 weeks PTO; wellness and professional development funds; equipment stipend; and more.

**Role Overview:**
Join the Platform Engineering team to ensure Fleetio applications are highly available, scalable, and performant. Impact customers, engineers, and the business directly. Manage cloud infrastructure, scale the Ruby on Rails stack, implement monitoring, review code for performance, debug production issues, and automate infrastructure.

**Who You Are:**
An Infrastructure Engineer experienced in scaling Ruby on Rails applications and passionate about optimization and performance, with a strong background in SRE/Infrastructure Engineering for Rails. You follow Agile/DevOps principles, influence teams effectively, and are an excellent problem-solver in a fast-paced environment. (Mention "coffee" in your application.)

**Your Impact (Responsibilities):**
* Manage cloud infrastructure using Infrastructure as Code (IaC).
* Manage and scale a Ruby on Rails stack.
* Implement monitoring tools to improve observability.
* Perform code reviews focusing on performance requirements.
* Debug production issues across all stack levels.
* Plan for growth, and optimize and automate Fleetio’s infrastructure.

**Your Experience (Requirements):**
* 5+ years of AWS experience.
* 3+ years of Kubernetes experience.
* Ruby on Rails experience.
* Expert at profiling and benchmarking source code.
* Effective at code review, identifying potential performance problems.
* Experience with Datadog or other APM tools.
* Excellent written and verbal communication skills.

**Considered a Plus:**
* Infrastructure as Code tools (Terraform).
* Deep understanding of cloud network fundamentals (routing, firewalls, load balancers, CDNs, VPCs, etc.).
* Experience with distributed event/data stores (Kafka, Redis, Elasticsearch, Memcached, TimescaleDB).
* Knowledge of the fleet management industry.

Post Date: April 17, 2025