Senior Software Engineer, AI Infrastructure

at Nvidia
USD 224,000-425,500 per year
SENIOR
✅ On-site

Required Skills & Competences

Security @ 4, Docker @ 4, Kubernetes @ 4, Terraform @ 4, Python @ 4, CI/CD @ 4, Distributed Systems @ 7, AWS @ 4, Azure @ 4, Communication @ 4, Helm @ 4, OpenStack @ 4, Load Testing @ 4, IaaS @ 4, Rust @ 4, Oracle @ 4, OpenShift @ 4, GPU @ 4

Details

NVIDIA DGX Cloud is a managed, multi-cloud AI supercomputing service that provides enterprises with instant access to NVIDIA's high-performance AI infrastructure and software, including dedicated DGX AI supercomputing clusters, optimized software stacks, and expertise. The platform enables users to rapidly build, train, and deploy large-scale AI models across leading cloud providers like Oracle, Azure, and Google Cloud, eliminating the complexity of managing their own infrastructure. Key features include pre-trained and fine-tunable models, serverless GPU inference, and a unified interface for multi-cloud management.

Role overview

NVIDIA is looking for a passionate engineer to join the DGX Engineering Team as a Senior Software Engineer. In this role you will play a significant part in shaping the future of AI and GPUs in the cloud. You will work on cloud-scale software systems that deliver GPU-powered services worldwide.

Responsibilities

  • Design, build, and implement scalable cloud-based systems for PaaS/IaaS.
  • Build RESTful cloud services and virtualization frameworks that form the NVIDIA DGX Cloud Reference Architecture.
  • Drive performance tuning and automation for systems that support extensive AI workloads with high security and maximum performance requirements.
  • Work closely with other teams on new products and features, and on improvements to existing products.
  • Support, maintain, and document software functionality.

Requirements

  • Expertise in Kubernetes (K8s) and KubeVirt.
  • Expertise with virtualization technologies such as Firecracker, KVM, OpenStack, Nutanix AHV, and Red Hat OpenShift.
  • Extensive experience with Golang and building RESTful web services.
  • Strong understanding of cloud design in virtualization, global infrastructure, distributed systems, and security.
  • Experience with Docker and containers.
  • Background with Infrastructure as Code.
  • Experience with AWS services (e.g., Fargate, EC2, IAM, ECR, EKS, Route 53).
  • Experience with Continuous Integration and Continuous Delivery (CI/CD).
  • BS or MS in Computer Science, or equivalent experience, with 12+ years of hands-on software engineering.
  • Excellent interpersonal and written communication skills.

Ways to stand out

  • Experience with Postgres.
  • Exposure to Helm charts and Terraform.
  • Prior experience with Rust and Python, and a record of delivering complex projects in previous roles.
  • Experience with load testing frameworks and secrets management.
  • A track record of solving complex problems with elegant solutions.

Compensation & benefits

  • Base salary ranges: 224,000 USD - 356,500 USD for Level 5; 272,000 USD - 425,500 USD for Level 6. Your base salary will be determined based on location, experience, and pay of employees in similar positions.
  • You will also be eligible for equity and benefits.

Additional details

  • Location: Santa Clara, CA, United States.
  • Employment type: Full time.
  • Applications for this job will be accepted at least until September 22, 2025.

Equal opportunity

NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer. It does not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.