Infrastructure Engineer

 

Description:

We are seeking an experienced Infrastructure Engineer to architect, deploy, and maintain a high-performance computing environment that supports foundational models for Earth systems. The infrastructure you design will underpin services that critical sectors rely on for operational continuity and disaster preparedness. You'll work with cutting-edge technologies to build resilient, scalable systems that handle massive computational workloads and deliver reliable predictions to infrastructure operators across those sectors.

 

Key Responsibilities

  • GPU Infrastructure Management: Design, deploy, and maintain GPU clusters to support high-performance computing needs. Ensure optimal performance and availability of GPU servers.
  • High-Availability System Design: Build infrastructure with redundancy and failover capabilities suitable for services relied upon by critical sectors like energy, transportation, and emergency services.
  • Distributed Computing at Scale: Maintain and optimize Kubernetes and Ray clusters for distributed machine learning inference and large-scale data-processing workloads.
  • Data Management: Design and maintain data storage solutions for petabyte-scale weather and simulation data, focusing on accessibility and performance.
  • Infrastructure as Code (IaC): Utilize automation tools to provision, scale, and manage infrastructure, ensuring cloud environments are secure, scalable, and reliable.
  • Security and Compliance: Enforce security best practices, including network policies, access controls, and vulnerability management across the infrastructure.
  • Continuous Integration/Continuous Deployment (CI/CD): Collaborate with development teams to implement CI/CD pipelines, automating testing and deployment to improve software delivery.
  • Monitoring and Optimization: Implement monitoring solutions to track system performance, identify bottlenecks, and proactively resolve potential issues.

 

Qualifications

Experience: Proven experience in infrastructure engineering and automation.

 

Technical Skills:

  • Proficiency in infrastructure automation tools and principles.
  • Strong experience with container orchestration platforms like Kubernetes.
  • Hands-on experience with distributed computing frameworks (e.g., Ray).
  • Familiarity with major cloud providers and their high-performance computing offerings.
  • Knowledge of monitoring and observability best practices.
  • Familiarity with CI/CD tools and practices.
  • Solid understanding of networking concepts and security best practices.

Organization: Enigma
Industry: Engineering Jobs
Occupational Category: Infrastructure Engineer
Job Location: Seattle, USA
Shift Type: Morning
Job Type: Full Time
Gender: No Preference
Career Level: Intermediate
Experience: 2 Years
Posted at: 2025-03-24 7:59 am
Expires on: 2025-05-08