Senior Cloud Platform Software Engineer

at NVIDIA

📍 Seattle, United States

USD 224,000-356,500 per year

SENIOR

✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Kubernetes @ 4, IaC @ 4, Networking @ 4, GPU @ 4

Details

NVIDIA is building a cloud offering for AI workloads (DGX Cloud) and seeks a Cloud Platform Engineer to drive technical design and build foundational elements of high-performing cloud services for AI and high-performance computing. This is an opportunity to be a founding member of a team working at the intersection of scalable, fault-tolerant cloud services and AI.

Responsibilities

Build and design platforms for DGX Cloud services as part of the service team.
Combine HPC and Kubernetes best practices to help create a unified platform.
Collaborate with software engineers, product teams, and engineering teams across NVIDIA on DGX Cloud AI Compute services.
Write Infrastructure as Code (IaC), work on Kubernetes, and help design and implement release pipelines.
Collaborate on the effective use of GitOps and pipelines.

Requirements

BS in Computer Science, Information Systems, Computer Engineering, or equivalent experience.
Solid technical foundation in distributed computing and storage, including substantial experience with server systems, storage, I/O, networking, and system software.
12+ years of platform engineering experience on large-scale production systems.
Engineering expertise in Kubernetes and IaC, including Kubernetes concepts such as Pod Disruption Budgets.
Ability to understand and communicate complex designs, distributed infrastructure, and requirements to peers, customers, and vendors.
General knowledge of shared storage systems such as NFS, LustreFS, and GlusterFS.
Familiarity with system-level architecture, such as interconnects, memory hierarchy, interrupts, and memory-mapped I/O.

Ways to stand out / Preferred

Proven experience in high-performance computing (HPC), Deep Learning, and/or GPU-accelerated computing domains.
Experience with large-scale distributed systems, HPC, and ML training using Slurm and Kubernetes.
Deep knowledge of both software and hardware in HPC and ML infrastructure.

Compensation & Benefits

Base salary range: 224,000 USD - 356,500 USD (will be determined based on location, experience, and pay of employees in similar positions).
Eligible for equity and benefits (link to NVIDIA benefits provided in original posting).

Additional Information

Applications accepted at least until September 22, 2025.
NVIDIA is an equal opportunity employer and committed to fostering a diverse work environment.