Used Tools & Technologies
Not specified
Required Skills & Competences
Kubernetes @ 4, IaC @ 4, Networking @ 4, GPU @ 4
Details
NVIDIA is building a cloud offering for AI workloads and managed DGX Cloud services. The team focuses on scalable, fault-tolerant cloud services for AI and high-performance computing. This role is for a Cloud Platform Engineer who will drive technical design and build foundational elements of high-performing cloud services.
Responsibilities
- Design and build platforms for DGX Cloud services as part of the service team
- Combine best practices from HPC and Kubernetes to create a unified platform
- Collaborate with software engineers, product teams, and engineering teams across NVIDIA on DGX Cloud AI Compute services
- Write Infrastructure-as-Code (IaC) and work on Kubernetes
- Design and implement release pipelines
- Collaborate to make the best use of GitOps and pipelines
Requirements
- BS in Computer Science, Information Systems, or Computer Engineering, or equivalent experience
- Solid technical foundation in distributed computing and storage, including substantial experience with server systems, storage, I/O, networking, and system software
- 12+ years of platform engineering experience on large-scale production systems
- Hands-on engineering expertise in Kubernetes and IaC
- Ability to understand and communicate complex designs, distributed infrastructure, and requirements to peers, customers, and vendors
- General knowledge of shared storage systems such as NFS, Lustre, and GlusterFS
- Familiarity with system-level architecture, including interconnects, memory hierarchy, interrupts, and memory-mapped I/O
Preferred / Ways to stand out
- Proven experience in high-performance computing (HPC), deep learning, and/or GPU-accelerated computing domains
- Experience with large-scale distributed systems, HPC, and ML training using Slurm and Kubernetes
- Deep knowledge of both software and hardware in HPC and ML infrastructure
Compensation & Benefits
- Base salary range: 224,000 USD - 356,500 USD (determined by location, experience, and comparable pay)
- Eligible for equity and company benefits
Other information
- Applications accepted at least until September 22, 2025
- NVIDIA is an equal opportunity employer committed to diversity and non-discrimination.