Lead Software Engineer, Fleet Management - DGX Cloud

at NVIDIA
USD 224,000-431,250 per year
SENIOR
✅ On-site

Required Skills & Competences

Docker @ 4, Go @ 6, Kubernetes @ 4, Linux @ 3, Python @ 6, GCP @ 4, Distributed Systems @ 4, AWS @ 4, Azure @ 4, Communication @ 4, JavaScript @ 3, PostgreSQL @ 4, Next.js @ 3, React @ 3, Angular @ 3, Debugging @ 7, API @ 4, GPU @ 4, AI @ 4, Data Pipelines @ 4

Details

Join NVIDIA's DGX Cloud team to build foundational systems for high-performance GPU infrastructure. You'll play a technical lead role designing scalable cloud services that integrate GPU telemetry from datacenters and enable operational automation across global cloud operations.

Responsibilities

  • Act as technical lead for a team of software engineers designing cloud services backed by databases and data warehouses.
  • Design and develop RESTful APIs to ingest telemetry from AI datacenters.
  • Build scalable cloud services for high-volume ingestion, processing, and storage of large datasets.
  • Build and manage data pipelines for online and offline data storage.
  • Collaborate across teams to codify business processes into scalable, self-measuring systems.
  • Optimize the reliability and efficiency of cloud services and operations.
  • Lead and ship impactful technical projects, ensuring quality and scalability at every stage.

Requirements

  • 12+ years of industry experience with a Bachelor’s or Master’s degree (or equivalent experience); PhD preferred.
  • Expertise in building scalable REST APIs backed by PostgreSQL-compatible data stores.
  • Proficiency in programming languages such as Go or Python.
  • Familiarity with modern JavaScript frameworks (for example, React, Angular, Next.js).
  • Expertise in cloud infrastructure (AWS, GCP, Azure) and container technologies like Docker and Kubernetes.
  • Expertise with high-scale distributed systems, including architectural patterns for APIs and data pipelines.
  • Outstanding communication and collaboration skills focused on solving complex operational challenges.
  • Familiarity with Linux operating systems.

Ways to Stand Out

  • Track record of leading engineers to the successful delivery and operation of high-performance cloud services at Internet scale.
  • Experience operating NVIDIA datacenter GPUs.
  • Strong debugging and problem-solving skills in distributed environments.

Compensation & Additional Information

  • Base salary ranges by level: Level 5 — 224,000 USD to 356,500 USD; Level 6 — 272,000 USD to 431,250 USD.
  • Eligible for equity and benefits.
  • Applications accepted at least until April 24, 2026.
  • NVIDIA uses AI tools in its recruiting processes and is an equal opportunity employer committed to diversity.