Senior Cloud Platform Software Engineer

at Nvidia
USD 224,000-356,500 per year
SENIOR
✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Kubernetes @ 4, IaC @ 4, Distributed Systems @ 4, Networking @ 4, API @ 4, GPU @ 4

Details

NVIDIA is building a best-in-class cloud offering for AI workloads and managed services under the DGX Cloud umbrella. The team is focused on delivering scalable managed self-service APIs and foundational elements of high-performing cloud services for AI and high-performance computing. This role is a founding-member opportunity to build at the intersection of scalable fault-tolerant cloud services and AI.

Responsibilities

  • Build and design platforms for DGX Cloud services as part of the service team
  • Combine best practices from HPC and Kubernetes to create a unified platform
  • Collaborate with software engineers, product teams, and engineering teams across NVIDIA on DGX Cloud AI Compute services
  • Write Infrastructure as Code (IaC), work on Kubernetes, and help design and implement release pipelines
  • Collaborate with the team to understand and apply GitOps practices and pipelines effectively

Requirements

  • BS in Computer Science, Information Systems, Computer Engineering, or equivalent experience
  • Solid technical foundation in distributed computing and storage, including substantial experience with server systems, storage, I/O, networking, and system software
  • 12+ years of platform engineering experience on large-scale production systems
  • Hands-on engineering expertise in Kubernetes and Infrastructure as Code (IaC)
  • Ability to understand and communicate complex designs, distributed infrastructure, and requirements to peers, customers, and vendors
  • General knowledge of shared storage systems such as NFS, LustreFS, and GlusterFS
  • Familiarity with system-level architecture (interconnects, memory hierarchy, interrupts, memory-mapped I/O)

Ways to stand out

  • Proven experience in high performance computing, deep learning, and/or GPU-accelerated computing domains
  • Experience with large-scale distributed systems, HPC, ML, and training workloads using Slurm and Kubernetes
  • Deep knowledge of both software and hardware in HPC and ML infrastructure

Company and other information

NVIDIA is a leader in AI, High-Performance Computing, and Visualization. The company emphasizes creativity, discovery, and building products and services centered around GPU technology. NVIDIA is an equal opportunity employer and values diversity.

Compensation & Benefits

  • Base salary range: 224,000 USD - 356,500 USD (determined based on location, experience, and pay of employees in similar positions)
  • Eligible for equity and benefits (see NVIDIA benefits)

Other

  • Applications accepted at least until September 29, 2025.