Used Tools & Technologies
Not specified
Required Skills & Competences
System Administration @ 3, Ansible @ 6, Docker @ 5, Grafana @ 3, Kubernetes @ 6, Linux @ 3, Prometheus @ 3, Terraform @ 6, GCP @ 5, Machine Learning @ 6, MLOps @ 6, Data Science @ 3, Leadership @ 3, AWS @ 5, Azure @ 5, Helm @ 5, Mathematics @ 3, Networking @ 3, Microservices @ 3, Debugging @ 6, Technical Leadership @ 3, GPU @ 5, LLMOps @ 6
Details
Do you want to be part of the team that brings emerging Artificial Intelligence (AI) technology to the field? We are looking for a hardworking Solution Architect (SA) to join the DGX Cloud SA Segment Team. The mission of the DGX Cloud Segment team is to guide and enable the successful adoption, at scale, of DGX Cloud and NVIDIA AI Enterprise Software in production.
NVIDIA DGX Cloud is an AI platform for developers, researchers, and enterprises, optimized for the demands of Generative AI. The DGX Cloud SA team is dedicated to shaping the future of DGX Cloud by actively gathering and incorporating partner feedback and product requirements. Our team will help optimize the onboarding process for NVIDIA Cloud Partners, ensuring fast time to insights and an exceptional user experience. Additionally, we will collaborate with internal teams to scale expertise and knowledge through training and the creation of repeatable guides. Our focus on building reliable infrastructure, partner qualifications, and assets will streamline onboarding, ultimately increasing adoption of DGX Cloud.
Responsibilities
- Work closely with DGX Cloud Partners; become their trusted technical advisor, advocate for their needs, and ensure they are successful in accomplishing their business goals with the platform.
- Accelerate NVIDIA Cloud Partner onboarding and improve cluster manageability and reliability.
- Scale knowledge, reach, and opportunities by building and educating vertical teams and communities on DGX Cloud and NVIDIA Reference Architectures.
- Communicate findings gathered from the field to Reference Architecture teams.
- Provide technical education and facilitate field product feedback to improve DGX Cloud.
- Enable partners to participate in the DGX Cloud Ecosystem with the goal of end-user satisfaction and increased sales.
Requirements
- BS, MS, or Ph.D. in Engineering, Mathematics, Physics, Computer Science, Data Science, or equivalent experience.
- 5+ years of proven experience with one or more Cloud Service Providers (AWS, Azure, GCP, or OCI), NVIDIA Cloud Partners (e.g., CoreWeave, Lambda Labs, Crusoe), and cloud-native architectures and software.
- Demonstrated experience in technical leadership and success in working with customers.
- Expertise with parallel filesystems (e.g., Lustre, GPFS, BeeGFS, WekaIO) and high-speed interconnects (InfiniBand, Omni-Path, RoCE, Gig-E).
- Strong coding and debugging skills; demonstrated expertise in one or more areas: Machine Learning, Deep Learning, Slurm, Kubernetes, MPI, MLOps, LLMOps, Ansible, Terraform, and other high-performance AI cluster solutions.
- Proficient in deploying GPU applications in Slurm and Kubernetes, and with containers: Docker, Helm, and registries.
- Experience with Linux-based configuration management and monitoring solutions, including system administration, OS installation, configuration, and troubleshooting.
- Knowledge of networking technologies for complex infrastructure configuration (router, firewall, load balancer, DNS, VPN).
Ways to stand out
- Experience using DGX Cloud and NVIDIA AI Enterprise Software, including Base Command Manager, NeMo, and NVIDIA's Inference Microservices.
- Experience with AI application development and deployment.
- Background deploying and configuring observability tooling such as Grafana, Prometheus, Weights & Biases (W&B), Nagios, Zabbix.
- Experience with high-performance or large-scale computing environments.
Compensation and benefits
- Base salary range: 148,000 USD - 235,750 USD (determined based on location, experience, and pay of employees in similar positions).
- Eligible for equity and benefits (see NVIDIA benefits).
Additional information
- Applications for this job will be accepted at least until September 5, 2025.
- NVIDIA is an equal opportunity employer and committed to fostering a diverse work environment.