Senior Software Engineer - Accelerated Kubernetes Runtime Team
at NVIDIA
USD 184,000-356,500 per year
Used Tools & Technologies
Not specified
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level value.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Security @ 3
Go @ 7
Kubernetes @ 4
Distributed Systems @ 7
Helm @ 4
API @ 4
GPU @ 4
AI @ 4
HPC @ 4
Details
Join NVIDIA's Accelerated Kubernetes Runtime team and be at the forefront of building the next generation of GPU-accelerated Kubernetes runtime distributions. As a Software Engineer on the Runtime team, you will design and build automation systems that enable engineers to seamlessly install, upgrade, and manage cluster runtime packages powering NVIDIA's AI Accelerators. You'll work on controller systems that optimize runtime components for the latest GPU architectures (including GB200/GB300, Vera Rubin, and beyond), ensuring reliable, secure, and performant infrastructure for AI researchers and developers.
Responsibilities
- Design and implement runtime features that orchestrate the lifecycle of runtime components across thousands of Kubernetes clusters without manual intervention.
- Build and maintain systems that configure, package, validate, and distribute accelerated compute components.
- Develop Kubernetes controllers, CustomResourceDefinitions (CRDs), and operators that automate runtime installation, upgrade, and rollback operations with API-driven workflows.
- Create automation-first, self-service tooling to minimize manual effort while enhancing reliability and reproducibility.
Requirements
- Bachelor’s degree in Computer Science or equivalent experience.
- 8+ years of professional software engineering experience, with at least 3 years of Kubernetes development experience.
- Experience building production Kubernetes systems with strong expertise in controllers, operators, and CRDs.
- Strong proficiency in Go and experience building scalable Go services that manage complex distributed systems.
- Hands-on experience with Helm, Kustomize, and managing Kubernetes manifest packaging and templating.
- Demonstrated ability to design and implement automation systems that replace manual processes with reliable, self-service tooling.
Ways to stand out
- Experience with NVIDIA Kubernetes components such as GPU Operator, device plugins, or other HPC components in large-scale production environments.
- Familiarity with OCI registries, artifact signing, SBOM generation, and supply chain security practices.
- Experience building multi-tenant platform services with emphasis on API design, versioning, and backward compatibility.
- Track record of migrating legacy systems to modern, automated platforms while maintaining zero-downtime operations.
- Contributions to upstream Kubernetes/CNCF projects or experience extending Kubernetes API machinery.
- Deep understanding of Kubernetes architecture including API machinery, admission controllers, and resource lifecycle management.
Compensation & benefits
- Base salary range (varies by level and location):
  - Level 4: 184,000 USD - 287,500 USD
  - Level 5: 224,000 USD - 356,500 USD
- Eligible for equity and benefits.
Additional information
- Applications accepted at least until April 11, 2026.
- NVIDIA uses AI tools in its recruiting processes and is an equal opportunity employer committed to diversity.