Senior Systems Engineer, Compute Platform - Autonomous Vehicles

at Nvidia

📍 Santa Clara, United States

$180,000-339,200 per year

SENIOR
✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

GCP @ 4, Airflow @ 4, Distributed Systems @ 6, AWS @ 4, Communication @ 7, Kubeflow @ 4

Details

We are looking for Senior Engineers to work on scaling our cloud compute platform for Autonomous Vehicles (AV). Our platform provides access to hundreds of petabytes of data and exascale GPU and CPU compute for a range of AV workloads, including data ingestion, data processing, and model training. We are starting to build the next generation of the platform and are looking for strong engineers to join us on this journey.

Responsibilities

  • Enhance and scale our compute platform to support diverse workloads on GPUs and CPUs
  • Design and build scalable, distributed services to power large-scale workloads
  • Design and build scalable tools to efficiently operate services and hardware clusters
  • Collaborate with multiple teams to understand their needs, and build functionality that improves their user experience and productivity
  • Participate in operations, on-call, and user support

Requirements

  • BS, MS, or PhD in Computer Science, Engineering, or another technical field, or equivalent experience
  • 6+ years of experience developing and operating backend systems at scale
  • Proficiency in Golang and distributed systems
  • Deep care for user experience
  • Strong collaboration and communication skills
  • Extremely motivated, highly passionate, and curious about state-of-the-art technologies, which you actively follow
  • Strong willingness to learn, listen to diverse opinions, and contribute to an inclusive and growth-oriented culture

Ways to stand out from the crowd:

  • Prior background in building AI Infrastructure for Autonomous Vehicles
  • Familiarity with HPC and workload managers (e.g. SLURM)
  • Experience with workflow orchestration systems (e.g. Flyte, Kubeflow Pipelines, Airflow)
  • Experience managing and deploying services on the cloud (e.g. AWS, GCP)
  • Open source contributions