Staff Software Engineer

 

Description:

We are seeking a Staff DevOps Engineer to drive the reliability, scalability, and performance of our systems. In this role, you will architect and support a large-scale EKS environment powering internal applications and firmware development. You will collaborate with cross-functional teams to evolve our real-time monitoring, logging, and alerting solutions while optimizing the entire lifecycle of our services, from inception and design to deployment and refinement.

Responsibilities:
 

  • Resource Optimization: Oversee the lifecycle management of cloud resources, leveraging advanced orchestration techniques to improve efficiency and scalability.
  • Observability & Monitoring: Optimize our observability and telemetry platforms for real-time performance monitoring, logging, and alerting, using tools like Prometheus, Grafana, and OpenTelemetry.
  • Operational Excellence: Maintain and enhance systems post-deployment by monitoring system health, optimizing availability and latency, and ensuring operational reliability.
  • Scalable Automation: Implement automation solutions to scale systems sustainably while driving improvements in reliability and deployment velocity.
  • Incident Response: Participate in on-call rotations to support production systems, handle incidents through a sustainable response process, and conduct blameless postmortems to refine workflows.
  • Tooling & Platforms: Develop and maintain tools, platforms, and self-service frameworks with a user-centric approach to enhance internal team productivity and operational efficiency.

Qualifications:
 

  • Educational Background: Bachelor's degree in Computer Science, a related technical field, or equivalent experience.
  • Infrastructure Expertise: 5+ years of experience with infrastructure automation, distributed systems, and production-grade private or public cloud systems.
  • Observability & Telemetry: Proven track record in implementing and supporting observability platforms using tools like Grafana, Prometheus, and OpenTelemetry.
  • Cloud & Kubernetes Knowledge: Deep understanding of Kubernetes (e.g., EKS), ArgoCD, Crossplane, and multi-cloud platforms.
  • Programming Skills: Proficiency in Python or Go for building automation and operational tools.
  • Linux & Networking Proficiency: Expertise in Linux systems, networking concepts, and containerization technologies.
  • Problem Solving & Ownership: A systematic approach to debugging and optimizing systems with a strong sense of ownership and attention to detail.

Organization: Rivian and Volkswagen Group Technologies
Industry: IT / Telecom / Software Jobs
Occupational Category: Staff Software Engineer
Job Location: Irvine, USA
Shift Type: Morning
Job Type: Full Time
Gender: No Preference
Career Level: Experienced Professional
Experience: 5 Years
Posted at: 2025-03-08 7:52 pm
Expires on: 2025-04-22