Product Manager, AI Platform Kernels and Communication Libraries
Required Skills & Competences
GitHub (6), Algorithms (3), Communication (3), Product Management (5), CUDA (3), GPU (3)
Details
NVIDIA's AI Software Platforms team seeks a technical product manager to accelerate next-generation inference deployments through innovative libraries, communication runtimes, and kernel optimization frameworks. This role bridges low-level GPU programming with ecosystem-wide developer enablement for products including CUTLASS, cuDNN, NCCL, NVSHMEM, and open-source contributions to Triton/FlashInfer.
Responsibilities
- Architect developer-focused products that simplify high-performance inference and training deployment across diverse GPU architectures.
- Define the multi-year strategy for kernel and communication libraries by analyzing performance bottlenecks in emerging AI workloads.
- Collaborate with CUDA kernel engineers to design intuitive, high-level abstractions for memory and distributed execution.
- Partner with open-source communities like Triton and FlashInfer to shape and drive ecosystem-wide roadmaps.
- Work with internal developers and external ecosystem users to identify key improvements, create roadmaps, and drive product strategy and go-to-market plans.
Requirements
- 5+ years of technical product management experience shipping developer products for GPU acceleration, with expertise in HPC optimization stacks.
- Expert-level understanding of CUDA execution models and multi-GPU protocols, with a proven track record of translating hardware capabilities into software roadmaps.
- BS or MS (or equivalent experience) in Computer Engineering or demonstrated expertise in parallel computing architectures.
- Strong interpersonal and communication skills, with experience explaining complex optimizations to developers and researchers.
Ways to stand out
- PhD or equivalent experience in Computer Engineering or a related technical field.
- Contributions to performance-critical open-source projects like Triton, FlashAttention, or TVM with measurable adoption impact.
- Experience crafting GitHub-first developer tools with strong community engagement.
- Published research on GPU kernel optimization, collective communication algorithms, or ML model serving architectures.
- Experience building cost-per-inference models incorporating hardware utilization, energy efficiency, and cluster scaling factors.
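The cost-per-inference modeling mentioned above can be illustrated with a minimal sketch. Everything here is a hypothetical example of such a model, not NVIDIA's methodology: the function name, parameters, and all figures are illustrative assumptions.

```python
# Hypothetical cost-per-inference model combining hardware utilization,
# energy efficiency, and a cluster-scaling factor. All names and numbers
# are illustrative assumptions.

def cost_per_inference(
    gpu_hourly_usd: float,       # amortized hardware + hosting cost per GPU-hour
    gpu_power_kw: float,         # average power draw per GPU under load, in kW
    energy_usd_per_kwh: float,   # electricity price
    peak_throughput_qps: float,  # sustained inferences/sec per GPU at peak
    utilization: float,          # fraction of peak throughput actually achieved
    scaling_efficiency: float = 1.0,  # multi-GPU/cluster efficiency factor (<= 1)
) -> float:
    """Estimate the USD cost of a single inference on one GPU."""
    effective_qps = peak_throughput_qps * utilization * scaling_efficiency
    hourly_cost = gpu_hourly_usd + gpu_power_kw * energy_usd_per_kwh
    # Inferences served per hour = effective_qps * 3600 seconds.
    return hourly_cost / (effective_qps * 3600)

# Example: $2.50/GPU-hr, 0.7 kW draw, $0.12/kWh, 500 QPS peak, 60% utilization.
print(cost_per_inference(2.50, 0.7, 0.12, 500.0, 0.6))
```

A model like this makes the trade-offs in the role concrete: kernel and communication-library improvements show up directly as higher utilization or scaling efficiency, which lowers the per-inference cost.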
Compensation & Logistics
- Base salary ranges by level:
  - Level 3: 144,000 USD - 218,500 USD
  - Level 4: 168,000 USD - 258,750 USD
- You will also be eligible for equity and benefits. Applications accepted at least until July 29, 2025.
Location
- Santa Clara, California, United States (on-site)
Equal Opportunity
NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer. NVIDIA does not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.