Solutions Architect, Generative AI Inference And Deployment

at Nvidia
USD 148,000-235,750 per year
Seniority: Middle
On-site


Used Tools & Technologies

Not specified

Required Skills & Competences

Kubernetes (3), Python (6), MLOps (3), TensorFlow (5), Communication (3), Mathematics (3), Parallel Programming (2), Debugging (5), LLM (6), PyTorch (5), GPU (5)

Details

NVIDIA is seeking outstanding AI Solutions Architects to support customers building solutions with the newest AI technology. Solutions Architects work across teams, helping customers adopt NVIDIA's Accelerated Computing and Deep Learning software and hardware platforms.

Responsibilities

  • Partner with other solution architects, engineering, product, and business teams to understand strategies and define high-value solutions.
  • Engage dynamically with developers, researchers, and data scientists across various technical areas.
  • Partner strategically with lighthouse customers and industry solution partners targeting NVIDIA's computing platform.
  • Help customers adopt and build creative solutions using NVIDIA technology and MLOps solutions.
  • Analyze performance and power efficiency of AI inference workloads on Kubernetes (a minimal measurement sketch follows this list).
  • Some travel to conferences and customers may be required.
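As an illustration of the performance-and-power analysis mentioned above, here is a minimal sketch (not part of the original posting) that samples GPU power draw, utilization, and memory use during an inference workload via the NVML Python bindings (`pynvml`). The device index, sampling interval, and duration are arbitrary assumptions.

```python
import time
import pynvml  # NVML Python bindings (package: nvidia-ml-py)

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # assumes GPU 0 runs the workload

samples = []
for _ in range(30):  # ~30 s of sampling; duration is an arbitrary choice
    power_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0  # NVML reports milliwatts
    util = pynvml.nvmlDeviceGetUtilizationRates(handle)        # .gpu is a percentage
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
    samples.append((power_w, util.gpu, mem.used / 2**30))
    time.sleep(1.0)

avg_power = sum(s[0] for s in samples) / len(samples)
print(f"avg power: {avg_power:.1f} W, "
      f"avg GPU util: {sum(s[1] for s in samples) / len(samples):.0f} %, "
      f"peak memory: {max(s[2] for s in samples):.1f} GiB")

pynvml.nvmlShutdown()
```

Pairing a trace like this with requests-per-second from the serving layer gives a simple requests-per-watt figure, one way to compare power efficiency across deployment configurations.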

Requirements

  • BS, MS, or PhD in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, Engineering or related fields (or equivalent experience).
  • 5+ years hands-on experience with Deep Learning frameworks like PyTorch and TensorFlow.
  • Strong fundamentals in programming, optimizations, and software design, especially in Python.
  • Proficient in problem-solving and debugging GPU orchestration and Multi-Instance GPU (MIG) management within Kubernetes environments (see the scheduling sketch after this list).
  • Experience with containerization, orchestration technologies, monitoring, and observability solutions for AI deployments.
  • Strong knowledge of LLM and Deep Learning inference theory and practice.
  • Excellent presentation, communication, and collaboration skills.
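For the MIG-on-Kubernetes requirement above, the following is a minimal sketch using the official `kubernetes` Python client to schedule an inference container onto a MIG slice. The pod name, image tag, and the `nvidia.com/mig-1g.5gb` resource name are assumptions; the exact resource name depends on how the NVIDIA device plugin and MIG profiles are configured in the cluster.

```python
from kubernetes import client, config

config.load_kube_config()  # assumes a local kubeconfig with cluster access

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="mig-inference-demo"),  # hypothetical name
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="server",
                image="nvcr.io/nvidia/tritonserver:24.05-py3",  # example image; tag is an assumption
                resources=client.V1ResourceRequirements(
                    # Request one 1g.5gb MIG slice instead of a full GPU; the
                    # resource name depends on the device-plugin MIG strategy.
                    limits={"nvidia.com/mig-1g.5gb": "1"},
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```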

Ways To Stand Out From The Crowd

  • Prior experience with DL training at scale or deploying/optimizing DL inference in production (a simple benchmarking sketch follows this list).
  • Experience with NVIDIA GPUs and software libraries such as NVIDIA NIM, Dynamo, TensorRT, TensorRT-LLM.
  • Excellent C/C++ programming skills including debugging, profiling, optimization, and performance analysis.
  • Familiarity with parallel programming and distributed computing platforms.
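On the inference-optimization point, a common starting place is a latency/throughput measurement loop. The sketch below uses PyTorch with a stand-in linear layer; the model, batch size, and iteration counts are placeholders and not anything specified in the posting.

```python
import time
import torch

model = torch.nn.Linear(4096, 4096).half().cuda().eval()  # stand-in for a real model
x = torch.randn(8, 4096, dtype=torch.float16, device="cuda")  # batch of 8 is arbitrary

# Warm-up so allocator and kernel caches do not distort the timing.
with torch.inference_mode():
    for _ in range(10):
        model(x)
torch.cuda.synchronize()

iters = 100
start = time.perf_counter()
with torch.inference_mode():
    for _ in range(iters):
        model(x)
torch.cuda.synchronize()  # wait for queued GPU work before stopping the clock
elapsed = time.perf_counter() - start

print(f"avg latency: {1e3 * elapsed / iters:.2f} ms/iter, "
      f"throughput: {iters * x.shape[0] / elapsed:.1f} samples/s")
```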

Compensation and Benefits

  • Base salary range: 148,000 USD - 235,750 USD, determined by location, experience, and the pay of employees in similar positions.
  • Eligible for equity and additional benefits offered by NVIDIA.

NVIDIA is committed to a diverse work environment and is an equal opportunity employer.