Used Tools & Technologies
Not specified
Required Skills & Competences
Docker @ 2, Kubernetes @ 2, Linux @ 3, DevOps @ 2, MLOps @ 2, Communication @ 3, Networking @ 3, Debugging @ 3, CUDA @ 3, GPU @ 3
Details
You will lead the deployment of innovative AI solutions and accelerated computing infrastructure for NVIDIA customers, working closely with customers to design, deploy, and debug data center GPU server and networking environments.
Responsibilities
- Work with NVIDIA AI Native customers on data center GPU server and networking infrastructure deployments.
- Guide customer discussions on network topologies and compute/storage, and support the bring-up of server, network, and cluster deployments.
- Identify new project opportunities for NVIDIA products and technology solutions in data center and AI applications.
- Conduct regular technical meetings with customers as a trusted advisor, discussing product roadmaps, cluster debugging, and new technology introductions.
- Build custom demonstrations and proofs of concept to address critical business needs.
- Analyze and debug compute and network performance issues.
Requirements
- BS/MS/PhD in Electrical/Computer Engineering, Computer Science, Physics, or related fields, or equivalent experience.
- 5+ years of experience in Solution Engineering or similar roles.
- System-level understanding of server architecture, NICs, Linux, system software, and kernel drivers.
- Practical knowledge of networking: switching & routing for Ethernet and InfiniBand, and data center infrastructure (power/cooling).
- Familiarity with DevOps/MLOps technologies such as Docker/containers and Kubernetes.
- Effective time management and ability to balance multiple tasks.
- Excellent communication skills for articulating ideas and code clearly through documents and presentations.
Ways to stand out
- External customer-facing skills and experience.
- Experience with the bring-up and deployment of large clusters.
- Proficiency in systems engineering, coding, and debugging, including C/C++, Linux kernel, and drivers.
- Hands-on experience with NVIDIA systems/SDKs (e.g., CUDA), NVIDIA networking technologies (e.g., DPU, RoCE, InfiniBand, or equivalent experience), and/or ARM CPU solutions.
- Familiarity with virtualization technology concepts.
Compensation & Benefits
- Base salary will be determined based on location, experience, and the pay of employees in similar positions.
- Base salary ranges: Level 3: 148,000 USD - 235,750 USD; Level 4: 184,000 USD - 287,500 USD.
- You will also be eligible for equity and benefits (see NVIDIA benefits page).
Additional information
- Location: Santa Clara, CA, United States (listed).
- Employment type: Full time.
- Applications for this job will be accepted at least until August 22, 2025.
- NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer.