Product Manager, AI Platform Kernels and Communication Libraries

at NVIDIA
USD 144,000-258,750 per year
Mid-Senior level
On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

  • Marketing (level 3)
  • GitHub (level 3)
  • Algorithms (level 3)
  • Communication (level 3)
  • Product Management (level 5)
  • CUDA (level 3)
  • GPU (level 3)

Details

NVIDIA's AI Software Platforms team seeks a technical product manager to accelerate next-generation inference deployments through innovative libraries, communication runtimes, and kernel optimization frameworks. This role bridges low-level GPU programming with ecosystem-wide developer enablement for products including CUTLASS, cuDNN, NCCL, NVSHMEM, and open-source contributions to Triton and FlashInfer.

Responsibilities

  • Architect developer-focused products that simplify high-performance inference and training deployment across diverse GPU architectures.
  • Define the multi-year strategy for kernel and communication libraries by analyzing performance bottlenecks in emerging AI workloads.
  • Collaborate with CUDA kernel engineers to design intuitive, high-level abstractions for memory and distributed execution.
  • Partner with open-source communities like Triton and FlashInfer to shape and drive ecosystem-wide roadmaps.
  • Work with developers inside and outside the company to identify key improvements, create roadmaps, and stay current on the evolving inference landscape.
  • Partner with NVIDIA leaders to define clear product strategy and with marketing teams to build go-to-market plans.

Requirements

  • 5+ years of technical product management experience shipping developer products for GPU acceleration, with expertise in HPC optimization stacks.
  • Expert-level understanding of CUDA execution models and multi-GPU protocols, with a proven track record of translating hardware capabilities into software roadmaps.
  • BS or MS in Computer Engineering (or equivalent experience), or demonstrated expertise in parallel computing architectures.
  • Strong interpersonal and technical communication skills, with experience explaining complex optimizations to developers and researchers.

Ways to stand out

  • PhD or equivalent experience in Computer Engineering or a related technical field.
  • Contributions to performance-critical open-source projects like Triton, FlashAttention, or TVM with measurable adoption impact.
  • Experience crafting GitHub-first developer tools with significant community engagement (e.g., >1k stars).
  • Published research on GPU kernel optimization, collective communication algorithms, or ML model serving architectures.
  • Experience building cost-per-inference models incorporating hardware utilization, energy efficiency, and cluster scaling factors (an illustrative sketch follows below).
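
For context on what such a model might look like, here is a minimal, purely illustrative Python sketch of a cost-per-inference calculation. All parameter names, default values, and the formula itself are assumptions made for illustration, not NVIDIA figures or an expected methodology.

```python
"""Illustrative sketch only: a toy cost-per-inference model.

All names and numbers below are hypothetical assumptions, not NVIDIA data.
"""

from dataclasses import dataclass


@dataclass
class ClusterConfig:
    gpu_hourly_cost_usd: float      # amortized hardware + hosting cost per GPU-hour (assumed)
    gpu_power_kw: float             # average board power draw in kW (assumed)
    energy_cost_per_kwh_usd: float  # electricity price per kWh (assumed)
    num_gpus: int                   # GPUs serving the model
    utilization: float              # fraction of peak throughput actually achieved (0-1)


def cost_per_inference(cfg: ClusterConfig, peak_throughput_per_gpu: float) -> float:
    """Return USD per inference for a steady-state serving cluster.

    peak_throughput_per_gpu: inferences/second one GPU sustains at 100% utilization
    (a measured or assumed number).
    """
    # Effective cluster throughput, discounted by realized utilization.
    effective_qps = cfg.num_gpus * peak_throughput_per_gpu * cfg.utilization

    # Hourly cost = hardware amortization + energy for the whole cluster.
    energy_usd_per_hour = cfg.num_gpus * cfg.gpu_power_kw * cfg.energy_cost_per_kwh_usd
    hardware_usd_per_hour = cfg.num_gpus * cfg.gpu_hourly_cost_usd
    total_usd_per_hour = hardware_usd_per_hour + energy_usd_per_hour

    inferences_per_hour = effective_qps * 3600.0
    return total_usd_per_hour / inferences_per_hour


if __name__ == "__main__":
    # Hypothetical numbers purely for illustration.
    cfg = ClusterConfig(gpu_hourly_cost_usd=2.50,
                        gpu_power_kw=0.7,
                        energy_cost_per_kwh_usd=0.12,
                        num_gpus=8,
                        utilization=0.6)
    print(f"${cost_per_inference(cfg, peak_throughput_per_gpu=150.0):.6f} per inference")
```

A production-grade model would additionally fold in factors such as networking, batching efficiency, and cluster scaling overheads; this sketch only shows the basic shape of the calculation.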

Compensation & Benefits

  • Base salary ranges provided by level:
    • Level 3: 144,000 USD - 218,500 USD
    • Level 4: 168,000 USD - 258,750 USD
  • You will also be eligible for equity and benefits. (See NVIDIA benefits information.)

Other details

  • Location: Santa Clara, CA, United States
  • Employment type: Full time
  • Applications for this job will be accepted at least until July 29, 2025.
  • NVIDIA is an equal opportunity employer committed to fostering a diverse work environment.