Senior Cloud Platform Software Engineer
at NVIDIA
Seattle, United States
USD 224,000-356,500 per year
Used Tools & Technologies
Not specified
Required Skills & Competences
Kubernetes @ 4, IaC @ 4, CI/CD @ 4, Distributed Systems @ 4, Networking @ 7, GPU @ 4
Details
NVIDIA is building a cloud offering for AI workloads (DGX Cloud) and seeks a Cloud Platform Engineer to design and implement foundational, high-performing cloud services for AI and HPC. This role is for engineers passionate about Infrastructure as Code (IaC), Kubernetes, GitOps, and scaling distributed systems to production.
Responsibilities
- Design and build platforms for DGX Cloud services as part of the service team.
- Combine best practices from HPC and Kubernetes to create a unified platform.
- Collaborate with software engineers, product teams, and engineering teams across NVIDIA on DGX Cloud AI compute services.
- Write Infrastructure as Code and work on Kubernetes-based platforms.
- Design and implement release pipelines and collaborate on GitOps and CI/CD pipelines.
- Drive technical design and foundational elements of scalable, fault-tolerant cloud services for AI.
Requirements
- BS in Computer Science, Information Systems, Computer Engineering, or equivalent experience.
- Strong technical foundation in distributed computing and storage, including experience with server systems, storage, I/O, networking, and system software.
- 12+ years of platform engineering experience on large-scale production systems.
- Demonstrated hands-on engineering expertise with Kubernetes and Infrastructure as Code.
- Ability to understand and communicate complex designs, distributed infrastructure, and requirements to peers, customers, and vendors.
- General shared storage knowledge (NFS, LustreFS, GlusterFS, etc.).
- Familiarity with system-level architecture (interconnects, memory hierarchy, interrupts, memory-mapped I/O).
Ways to stand out
- Proven experience in high-performance computing, deep learning, and/or GPU-accelerated computing domains.
- Experience with large-scale distributed systems, HPC, and ML training using Slurm and Kubernetes.
- Deep knowledge of both software and hardware in HPC and ML infrastructure.
Compensation & Benefits
- Base salary range: 224,000 USD - 356,500 USD (final base determined by location, experience, and internal pay benchmarks).
- Eligible for equity and benefits (see NVIDIA benefits page).
- Applications accepted at least until September 29, 2025.
About NVIDIA
NVIDIA leads developments in AI, High-Performance Computing, and Visualization. The company emphasizes GPU-accelerated computing and is committed to diversity and equal opportunity employment.