Required Skills & Competences
Security @ 4, Docker @ 6, Kubernetes @ 4, Linux @ 4, DevOps @ 4, Python @ 4, GitHub @ 4, CI/CD @ 4, ArgoCD @ 7, Data Science @ 4, Bash @ 4, Communication @ 4, Helm @ 6, SRE @ 4
Details
DGX Cloud at NVIDIA provides serverless generative AI infrastructure and seeks a DevOps Engineer focused on development infrastructure, CI/CD, testing frameworks, and Kubernetes-based cluster automation. The team ensures timely, quality-assured releases and supports developer tooling, CI pipelines, and end-to-end testing environments.
Responsibilities
- Provide both development and operational tooling critical to DGX Cloud services.
- Implement and operate services used by engineering, including first-level on-call/support.
- Maintain a well-optimized and supported paved-road SDLC, collaborating across engineering, testing, and SRE to ensure tool alignment.
- Ensure testing coverage from unit testing to CI, smoke-testing, and full end-to-end testing.
- Provide developer environments that are easily updated with a low barrier to entry.
- Develop and maintain continuous integration pipeline templates and testing frameworks.
- Provide and operate end-to-end integration environments for continuous testing.
- Automate deployment, configuration, and management of Kubernetes (K8s) components.
Requirements
- Bachelor’s or Master’s degree in Computer Science, Data Science, or a related field (or equivalent experience).
- 5+ years of experience developing DevOps tooling with a strong passion for automation.
- Solid background in modern source control platforms (GitHub / GitLab).
- Strong experience in modern CI/CD technologies (GitLab, testing frameworks, ArgoCD).
- Proficient in container-based infrastructure (Docker, Kubernetes, Helm).
- Comprehensive experience with Linux distributions (Ubuntu).
- Solid background in scripting languages (Bash, Python).
- Working experience in higher-level languages (Golang).
- Excellent written and verbal communication skills.
Ways to stand out
- Experience scaling DevOps practices across cross-functional teams.
- Demonstrated ability to handle sophisticated technical environments while meeting or exceeding security, reliability, scalability, and availability metrics.
- Strong and confirmed knowledge of modern architectures at scale.
Compensation & Benefits
- Base salary ranges: Level 3: 144,000 USD - 230,000 USD; Level 4: 168,000 USD - 270,250 USD. Base salary will be determined based on location, experience, and pay of employees in similar positions.
- Eligible for equity and company benefits.
Additional Information
- Applications for this job will be accepted at least until October 10, 2025.
- NVIDIA is an equal opportunity employer committed to fostering a diverse work environment.