Senior Software Engineer, CUTLASS Performance

at NVIDIA
USD 152,000-287,500 per year
SENIOR
✅ On-site

Used Tools & Technologies

HPC

Required Skills & Competences

Python @ 4, Performance Optimization @ 4, QA @ 4, LLM @ 4, PyTorch @ 4, GPU @ 4, Deep Learning @ 4, AI @ 4, vLLM @ 4, SGLang @ 4, Performance Analysis @ 4, JAX @ 4

Details

NVIDIA's high-performance computing platforms power AI across many applications. CUTLASS is an open-source ecosystem for high-performance linear algebra and Tensor Core primitives, providing C++ and Python abstractions for custom matrix multiply (GEMM) and related deep learning computations on NVIDIA GPUs.

Responsibilities

  • Benchmark the inference and training passes of state-of-the-art deep learning models to identify key GPU kernel and fusion opportunities.
  • Identify gaps between theoretical and realized performance, and suggest software improvements or model adjustments to resolve them.
  • Develop tooling to automate the benchmarking, analysis, and performance optimization loop to push the limit of CUTLASS kernel performance within DL networks.
  • Serve as the authoritative resource on kernel performance for the team and engage with GPU architecture, deep learning framework, and QA teams across NVIDIA as the CUTLASS performance representative.

Requirements

  • Master's or PhD in Computer Science, Computer Engineering, or a related field (or equivalent experience).
  • 3+ years of relevant industry experience.
  • Strong programming skills in Python and C++.
  • Experience in software performance analysis and optimization.
  • Deep understanding of computer architecture and familiarity with GPUs or similar parallel processing architectures.

Preferred / Ways to stand out

  • Deep understanding of state-of-the-art deep learning model architectures.
  • Hands-on experience with performance benchmarking of DL frameworks such as PyTorch, JAX, SGLang, vLLM, TRT-LLM, or others.
  • Experience developing performance models and performance regression systems.

Compensation & Benefits

  • Base salary ranges (determined by location, experience, and peer pay):
    • Level 3: 152,000 USD - 241,500 USD per year
    • Level 4: 184,000 USD - 287,500 USD per year
  • Eligible for equity and additional benefits (see the NVIDIA benefits link in the original posting).

Additional information

  • Applications accepted at least until June 5, 2026. This posting is for an existing vacancy.
  • NVIDIA uses AI tools in its recruiting processes.
  • NVIDIA is an equal opportunity employer and is committed to diversity and inclusion.