Senior Software Engineer, Platform

Plotly (Headquarters: Remote)

Location: Remote, Canada or US   |   Full-Time
As a company with roots in the open-source community, Plotly introduced web-based data visualization to Python. Today, the company offers Dash Enterprise, which provides the best software tools and platform to enable every enterprise in the world to build and scale data applications quickly and easily. Plotly combines cutting-edge technology with a collaborative environment to help data scientists, engineers, and analysts across the world achieve their goals. Founded by innovators and driven by our community of users and customers, we eagerly tackle every challenge, from crafting state-of-the-art UI for seamless data interaction to optimizing our graphing libraries and services for highly reliable performance. We are a tight-knit and quickly growing team where each member can make an immediate, meaningful impact. We take on complex problems, work hard, and are firm believers in the open-source mission. At Plotly, you'll work alongside a diverse team of first-class engineers, developers, scientists, and builders who challenge the status quo and set a high bar. We encourage each member of our team to continually explore and expand their skill set, and to approach every problem with curiosity and an open mind. Together, we make it possible for people everywhere to share data and insights that make a real impact in business and around the world.

Join Plotly at the intersection of infrastructure, cloud services, and scalable API backends. As a Senior Platform Engineer, you will help build and evolve Plotly Cloud, our Platform-as-a-Service (PaaS) for deploying Dash apps. You will work across cloud infrastructure (AWS) and Kubernetes-native resources (e.g., API Gateways and custom Controllers), and you will be responsible for upholding high standards for security, reliability, and performance within the PaaS. You will design, implement, and manage CI/CD pipelines for efficient and reliable software delivery and deployment with minimal downtime. You will automate the provisioning, configuration, and management of dev, staging, and production infrastructure. You will implement, test, and maintain robust disaster recovery strategies to ensure rapid recovery from production outages, and automate rollback mechanisms for problematic deployments. You will provide operational support to keep the platform stable and available. You will develop and implement automated testing strategies, including smoke tests and end-to-end (E2E) tests, that act as quality gates for continuous delivery. You will collaborate with cross-functional teams (QA, Product) to define requirements, troubleshoot issues, and ensure smooth releases. You will also contribute to the evolution of the platform architecture, focusing on scalability, resilience, and security, and you will help refine development workflows and advocate for best practices in coding, testing, and infrastructure management.

This role requires strong proficiency in Go, with experience building scalable, production-ready backend services and a solid understanding of dependency management and Go modules. You should have deep knowledge of Kubernetes fundamentals, including Deployments, Services, RBAC, and Namespaces, along with hands-on experience with Kubernetes controllers and operators and with extending the Kubernetes API using client libraries. You should be familiar with API Gateway implementations within Kubernetes (e.g., Traefik, Kong, Ambassador) and have a solid grasp of Kubernetes security best practices and their real-world implementation. We are looking for proven experience building and maintaining CI/CD pipelines and infrastructure automation workflows, and practical experience with Pulumi or Terraform for managing cloud infrastructure. You should have good knowledge of key AWS services (EKS, ECS, RDS, ALB, VPC, S3, SQS) and a deep understanding of cloud security and networking principles. You should know automated testing practices (e.g., smoke, E2E) integrated into delivery pipelines, be comfortable supporting and troubleshooting issues in live SaaS production environments, and have experience with observability tools (OpenTelemetry, Honeycomb).

The ideal candidate brings a quality-first mindset, a passion for building secure, reliable, and scalable systems, and thrives in environments that demand excellence in all three. You should have a demonstrated ability to contribute to technically complex projects and drive them to completion. Strong communication skills and a collaborative mindset are essential to work effectively across teams. You should take end-to-end ownership, from design and implementation to deployment and observability. Nice-to-haves include experience designing and building Platform-as-a-Service (PaaS) products, a passion for mentoring others and sharing technical knowledge, and active participation in the cloud-native ecosystem (e.g., contributing to CNCF projects or developing custom Kubernetes operators). A security-first mindset, with a deep understanding of advanced Kubernetes security practices, is also a plus. Even if you don't meet every requirement, Plotly encourages you to apply, as diverse perspectives drive innovation.
Post Date: May 30, 2025