Senior Datacenter System Software Architect - DGX Cloud

at NVIDIA
USD 184,000-356,500 per year
SENIOR
✅ On-site

Required Skills & Competences

Docker @ 4 Go @ 4 Kubernetes @ 4 Linux @ 4 Terraform @ 4 Python @ 4 Distributed Systems @ 3 Machine Learning @ 3 Data Science @ 4 TensorFlow @ 7 Hiring @ 4 Communication @ 3 Parallel Programming @ 4 Rust @ 4 Microservices @ 4 Debugging @ 7 PyTorch @ 7 CUDA @ 4 GPU @ 7

Details

NVIDIA is hiring engineers to scale up its AI Infrastructure. We expect you to have a strong programming background, a deep understanding of distributed systems, familiarity with software testing and deployment, and excellent communication and planning abilities. We also welcome out-of-the-box thinkers who can provide new ideas with a strong bias for execution. Expect to be constantly challenged, improving, and evolving for the better. You and other engineers in this team will help advance NVIDIA's capacity to build and deploy leading infrastructure solutions for a broad range of AI-based applications that affect core data science.

We’re looking for a highly motivated, creative engineer with strong experience in system software to join the DGX Cloud Software Team. You will lead the architecture, design and implementation of our next-generation DGX Cloud clusters using the latest technologies. On this team, you will do full-stack deployment including hardware architecture, workload orchestration and application performance tuning.

Are you ready to change the next generation of computing? Join us at the forefront of technological advancement.

Responsibilities

  • Lead technical activities for data centers with a focus on hybrid deployments between cloud and on-prem
  • Provide expertise in infrastructure workflows, including hardware, software release, workload orchestration and application tuning
  • Provide fast and creative solutions for complex problems and write effective, clear and reliable architecture specifications
  • Translate requirements to vision, architecture and roadmap
  • Work with engineering teams across NVIDIA to ensure your software integrates seamlessly from the hardware all the way up to the AI training applications

Requirements

  • Master's or PhD in Computer Science, Computer Engineering, Physics or equivalent experience
  • 9+ years of experience in this field
  • Coursework or familiarity with Data Sciences, Deep Learning, or Machine Learning
  • Ability to seamlessly shift between Linux system environments and Python programming
  • Programming skills in one or more high-level languages (C, C++, Go, Rust, etc.)
  • System-level experience with both hardware and software
  • Motivated self-starter with strong problem-solving skills and customer-facing communication skills
  • Strong design, coding, analytical, debugging and problem-solving skills
  • Passion for continuous learning and knowledge transfer; ability to work concurrently with multiple groups locally and abroad in the organization

Ways to stand out

  • Experience with GPU deep learning and data sciences; experience using TensorFlow, PyTorch or other deep learning frameworks
  • Experience working with Docker containers, Slurm, Terraform and Kubernetes
  • CUDA programming and NCCL experience
  • HPC programming experience including MPI, OpenACC, or other parallel programming tools
  • Hands-on experience with DGX Cloud, NVIDIA AI Enterprise software, Base Command Manager, NeMo and NVIDIA Inference Microservices
  • Interest in crafting, analyzing and fixing large-scale distributed systems
  • Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive

Compensation & Benefits

  • Base salary range: 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.
  • You will also be eligible for equity and benefits (see NVIDIA benefits page)

Additional information

  • Applications for this job will be accepted at least until October 23, 2025.
  • NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer. NVIDIA does not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.