Used Tools & Technologies
Not specified
Required Skills & Competences
Docker @ 4, Kubernetes @ 4, Linux @ 4, DevOps @ 4, MLOps @ 4, Communication @ 7, Networking @ 4, Product Management @ 4, Debugging @ 4, CUDA @ 4, GPU @ 4
Details
NVIDIA is looking for an experienced GPU and network systems Solutions Architect & Engineer. You will be part of a team that brings new Artificial Intelligence (AI) hardware and software technologies to production in customer data centers. As part of the NVIDIA SA organization, you will drive the deployment and integration of end-to-end technology solutions at strategic customers and offer recommendations to business and engineering teams on the product roadmap.
Responsibilities
- Work with NVIDIA AI Native and Consumer Internet customers on large data center GPU server and networking system deployments as a Solutions Architect & Engineer. Guide customer discussions on network design and compute/storage, and support the bring-up of server/network/cluster deployments. You will need to visit customer data centers during the bring-up phase.
- Demonstrate subject matter expertise in advanced GPU and network systems and be a trusted technical advisor to NVIDIA's strategic customers. Bring customer-specific requirements to product teams to guide product roadmap features.
- Identify new project opportunities for NVIDIA products and technology solutions in data center and artificial intelligence applications. Work closely with GPU/Network Systems Engineering, Product Management and Sales teams.
- Act as a trusted advisor to customers, conducting regular technical meetings that cover the product roadmap, debugging of cluster issues, feature discussions, and introductions to new technology solutions.
- Build custom product demonstrations and POCs for solutions that address critical business needs of customers.
- Analyze and debug compute/network configuration and performance issues to deliver performant clusters.
Requirements
- BS/MS/PhD in Electrical/Computer Engineering, Computer Science, Physics, or other Engineering fields or equivalent experience.
- 8+ years of Systems/Solution Engineering (or similar engineering roles) experience preferred.
- System-level expertise in CPU/GPU server architecture, NICs, Linux, system software, and kernel drivers.
- Experience with networking switches for Ethernet/InfiniBand, and data center infrastructure (power/cooling).
- Knowledge of DevOps/MLOps technologies such as Docker/containers and Kubernetes.
- Effective time management and the ability to balance multiple tasks.
- Strong verbal and written communication skills; able to share ideas and code clearly via documents and presentations.
Ways to stand out
- External customer-facing background.
- Experience with bring-up and deployment of large clusters.
- Systems engineering, coding, and debugging skills including experience with C/C++, Linux kernel and drivers.
- Hands-on experience with NVIDIA GPU systems/SDKs (e.g. CUDA), NVIDIA Networking technologies (NICs, RoCE, InfiniBand), and/or ARM CPU solutions.
- Familiarity with virtualization technology concepts.
Other details
- Occasional travel (~20%) is required for on-site visits to customers and industry events. The role is open to remote work locations.
- Applications for this job will be accepted at least until November 13, 2025.
- Compensation: The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5. You will also be eligible for equity and benefits.
Company
NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer. The company does not discriminate on the basis of protected characteristics and values diversity in employees.