Product Manager - Inference

at NVIDIA

📍 Santa Clara, United States

USD 144,000-258,750 per year

MIDDLE

✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Marketing @ 3
Software Development @ 3
GitHub @ 3
Algorithms @ 3
Machine Learning @ 3
Hiring @ 3
Leadership @ 3
Communication @ 6
Performance Optimization @ 3
Product Management @ 6
LLM @ 3
GPU @ 3

Details

Inference is the fastest-growing and most competitive area in Generative AI today. It is where AI models impact our daily lives, and where every bit of accuracy and performance matters for quality, safety, and cost. Inference is also constantly evolving, with new acceleration algorithms, use cases, and deployment techniques. As a Product Manager for AI Platform Inference, you will be responsible for building the tools, SDKs, and libraries that enable developers' inference deployments to thrive on NVIDIA GPUs.

As NVIDIA Product Managers, our goal is to enable developers to be successful on the NVIDIA Platform and to push the boundaries of what is possible in AI deployments. We are the champions inside NVIDIA for developers looking to accelerate their deployments on GPUs. We work directly with developers inside and outside of the company to identify key improvements, create roadmaps, and stay current on the inference landscape. We also work with NVIDIA leaders to define a clear product strategy, and with marketing teams to build go-to-market plans. The Product Management organization at NVIDIA is a small, strong, and impactful group. We focus on enabling deep learning across all GPU use cases and providing great solutions for developers. We are seeking a rare blend of product skills, technical depth, and passion to make NVIDIA great for developers.

Responsibilities

Create products to help developers build better inference deployments
Develop product strategy, roadmaps, and go-to-market plans
Collaborate with internal and external developers to build product-based roadmaps for model optimization software
Work with leadership to align with and drive company strategy

Requirements

Experience with inference deployment and optimization software (examples listed by the team: vLLM, SGLang, FlashInfer, TensorRT-LLM, Triton, Dynamo, TorchAO, etc.)
Demonstrable knowledge of Generative AI or machine learning concepts, particularly around performance optimization, and software development and delivery
BS or MS degree in Computer Science, Computer Engineering, or a similar field (or equivalent experience)
5+ years of experience in technical product management, or a similar role, at a technology company
Strong communication and interpersonal skills

Ways to stand out

Experience leading optimization products for inference
Experience working on open-source, GitHub-first developer products with deep customer interactions
Knowledge of GPU architecture, HW/SW co-design, and performance profiling

Compensation & Benefits

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 144,000 USD - 218,500 USD for Level 3, and 168,000 USD - 258,750 USD for Level 4. You will also be eligible for equity and benefits (see NVIDIA benefits page).

Application deadline

Applications for this job will be accepted at least until July 29, 2025.

Equal opportunity statement

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. We do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.