Senior Software Engineer, AI Infrastructure
Required Skills & Competences
Security @ 4, Docker @ 4, Kubernetes @ 4, Terraform @ 4, Python @ 4, CI/CD @ 4, Distributed Systems @ 4, AWS @ 4, Azure @ 4, Communication @ 4, Helm @ 4, OpenStack @ 4, Load Testing @ 4, IaaS @ 4, Rust @ 4, Oracle @ 4, OpenShift @ 4, GPU @ 4
Details
NVIDIA DGX Cloud is a managed, multi-cloud AI supercomputing service that provides enterprises with instant access to NVIDIA's high-performance AI infrastructure and software, including dedicated DGX AI supercomputing clusters, optimized software stacks, and expertise. The platform enables users to rapidly build, train, and deploy large-scale AI models across leading cloud providers like Oracle, Azure, and Google Cloud, eliminating the complexity of managing their own infrastructure. Key features include pre-trained and fine-tunable models, serverless GPU inference, and a unified interface for multi-cloud management.
You will join the DGX Engineering Team as a Senior Software Engineer. In this role you will help craft and guide the future of AI & GPUs in the Cloud, building cloud-scale software systems and GPU-powered services.
Responsibilities
- Design, build, and implement scalable cloud-based systems for PaaS/IaaS.
- Build RESTful cloud services and virtualization frameworks that form the NVIDIA DGX Cloud Reference Architecture.
- Work closely with other teams on new products or enhancements to existing products.
- Drive performance tuning and automation to deliver strong security and maximum performance for large-scale AI workloads.
- Support, maintain, and document software functionality.
Requirements
- Expertise in Kubernetes (K8s) and KubeVirt.
- Expertise in virtualization technologies such as Firecracker, KVM, OpenStack, Nutanix AHV, and Red Hat OpenShift.
- Extensive experience with Golang and building RESTful web services.
- Demonstrated understanding of cloud design in areas of virtualization, global infrastructure, distributed systems, and security.
- Experience with Docker and containers.
- Background with Infrastructure as Code.
- Experience with AWS (for example Fargate, EC2, IAM, ECR, EKS, Route 53).
- Experience with Continuous Integration and Continuous Delivery (CI/CD).
- BS or MS in Computer Science, or equivalent experience, with 12+ years of hands-on software engineering.
- Excellent interpersonal and written communication skills.
Ways to stand out
- Experience with Postgres.
- Exposure to Helm charts and Terraform.
- A track record of solving complex problems with elegant solutions.
- Prior experience with Rust and Python and demonstrated delivery of complex projects in previous roles.
- Experience with load testing frameworks and secrets management.
Compensation & Benefits
- Base salary range: 224,000 USD - 356,500 USD for Level 5; 272,000 USD - 425,500 USD for Level 6.
- Eligible for equity and company benefits (link to NVIDIA benefits provided in original posting).
Application & Other
- Applications accepted at least until September 22, 2025.
- NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer. The company does not discriminate on the basis of protected characteristics.