Senior GPU Kernel Performance Lead
Required Skills & Competences
Python @ 4, Debugging @ 7, CUDA @ 4, GPU @ 4
Details
We are looking for a Senior GPU Kernel Performance Lead to analyze and report on GPU kernel performance. The team delivers high-performance GPU math kernels to NVIDIA libraries such as cuDNN, cuBLAS, and TensorRT to accelerate deep learning models, and works on achieving peak performance and energy efficiency on current and future-generation GPUs. This position focuses on leading a team that validates kernel performance; there will also be opportunities for hands-on development. The CUTLASS open-source project is an example of the kind of performant CUDA matrix-multiply code the team produces.
Responsibilities
- Specify test cases, derived from Deep Learning workloads, to provide adequate directed and use-case coverage across all kernels on both simulation and silicon targets.
- Determine theoretical performance expectations by developing and using analytical models.
- Track and report on kernel performance throughout the development lifecycle by using and expanding upon current infrastructure.
- Provide feedback to kernel developers by identifying performance regressions and opportunities to reach achievable peak performance.
Requirements
- PhD degree in Computer Science, Computer Engineering, Applied Math, or related field (or equivalent experience) with 8+ years of relevant industry experience.
- Demonstrated strong C++ programming and software design skills, including debugging, performance analysis, and test design.
- Experience leading or managing a team related to the performance of CPUs, GPUs, or other deep-learning accelerators.
Ways to stand out
- Experience with analytical models and cycle-accurate hardware simulators.
- Knowledgeable about performance tools like Nsight or VTune.
- Programming experience beyond C++ including assembly, MLIR/LLVM, Python, and CUDA/OpenCL.
Compensation & Benefits
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 224,000 USD - 356,500 USD for Level 5, and 272,000 USD - 425,500 USD for Level 6. You will also be eligible for equity and benefits.
Additional information
- The team works on accelerating deep learning domains such as image classification, speech recognition, natural language processing, and large language models, and on features like Tensor Cores.
- Applications for this job will be accepted at least until July 29, 2025.
- NVIDIA is an equal opportunity employer committed to fostering a diverse work environment, and it does not discriminate on the basis of protected characteristics.