Cloud Platform Software Engineer – Platform APIs

at NVIDIA
USD 184,000-287,500 per year
MIDDLE
✅ Hybrid

Used Tools & Technologies

Machine Learning

Required Skills & Competences

Software Development @ 6, Go @ 3, Kubernetes @ 3, IaC @ 3, Terraform @ 3, Python @ 3, GCP @ 3, CI/CD @ 3, Distributed Systems @ 3, AWS @ 3, Azure @ 3, API @ 3, GPU @ 3, Observability @ 3, AI @ 3

Details

Are you passionate about Kubernetes and AI and want to help build the best platform for ML/AI infrastructure? Do you thrive when your work directly empowers teams to push the boundaries of what's possible? We're the Platform API team within NVIDIA's DGX Cloud organization - a collaborative group of cloud platform engineers, architects, and SREs who are passionate about building and nurturing the declarative, Kubernetes-native control plane that powers GPU-accelerated infrastructure across multiple cloud providers. Together, we're empowering the world's leading AI teams to train and deploy at datacenter scale.

We design and extend Kube-like APIs, and we craft Go-based reconciliation controllers that thoughtfully turn high-level intent into production-ready AI infrastructure. We take pride in owning our code end-to-end, and we care deeply about the full lifecycle of multi-cloud GPU clusters, from customer onboarding and provisioning through upgrades and decommissioning. We partner closely with our runtime, cloud architecture, observability, and storage teams to solve sophisticated distributed systems challenges together. As a team, we're strengthening NVIDIA's approach to Cloud Native development.

Responsibilities

  • Develop software systems to support large-scale deployments of cloud infrastructure
  • Design and develop APIs to support Infrastructure as Code (IaC) automation and deployment workflows
  • Contribute to multiple source code projects that deliver the software services NVIDIA requires
  • Collaborate with engineering managers, architects, designers, and frontend engineers to deliver high-quality software
  • Automate the validation of software solutions with unit and integration tests
  • Participate in the ownership and health of CI/CD pipelines from dev to production environments
  • Collaborate with other specialists for feedback on proposed designs and product direction
  • Openly share successes and failures in a no-blame environment

Requirements

  • BS in Computer Science, Information Systems, Computer Engineering or equivalent experience
  • 8+ years of proven experience in large-scale software development
  • Experience building and shipping services on Kubernetes
  • Background in using and contributing to open-source projects
  • Experience collaborating with teams to write software to support cloud services at scale
  • Programming experience in a relevant language, e.g. Golang, Python
  • Ability to communicate design and quality strategy in written, visual, and oral formats
  • Experience with a wide range of modern infrastructure tools and technologies

Ways to stand out from the crowd (Preferred)

  • Experience with Kubernetes Cluster API, Terraform, Tinkerbell, and other infrastructure tooling
  • Practical experience with Azure, GCP, or AWS
  • Ability to refactor software to run on platforms such as Kubernetes
  • Ability to discuss and work with CSI, CNI, and CRI, and/or familiarity with the CNCF and the tooling across its ecosystem
  • Upstream contributions to open-source projects

Benefits

  • Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD.
  • You will also be eligible for equity and benefits (link to NVIDIA benefits referenced in original posting).

Additional information

  • #LI-Hybrid
  • Applications for this job will be accepted at least until June 14, 2026.
  • This posting is for an existing vacancy.
  • NVIDIA uses AI tools in its recruiting processes.
  • NVIDIA is an equal opportunity employer and is committed to fostering an inclusive work environment.