Senior DL Compiler Engineer - CUDA Tile
at NVIDIA
Santa Clara, United States
USD 152,000-241,500 per year
Used Tools & Technologies
LLM, GenAI
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2: basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6: daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9: you are an expert, you can teach others, you know all the pitfalls and tricks;
- 10: exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Algorithms @ 4
Hiring @ 4
Performance Optimization @ 4
Debugging @ 4
API @ 4
CUDA @ 4
GPU @ 4
Deep Learning @ 4
Generative AI @ 4
AI @ 4
Computer Vision @ 4
OpenCL @ 4
Performance Analysis @ 6
Details
NVIDIA is hiring software engineers for the CUDA Tile team. NVIDIA GPUs power modern AI and deep learning workloads across generative AI, large language models, recommendation systems, speech recognition, and computer vision. The CUDA Tile team works on a tile-based programming model for NVIDIA GPUs (CUDA Tile shipped with CUDA 13.1).
Responsibilities
- Design and implement compiler transformations and optimization techniques for tile-based kernels.
- Develop MLIR-based dialects and lowering passes.
- Optimize performance of tile-based kernels to run efficiently across multiple generations of NVIDIA GPU architectures.
- Define public APIs and implement related compiler functionality.
- Perform general software engineering work, including performance optimization, debugging, and test design.
Requirements
- Bachelor's, Master's, or Ph.D. in Computer Science, Computer Engineering, or a related field (or equivalent experience).
- 3+ years of relevant work or research experience in compiler optimization, performance analysis, and IR design.
- Excellent C and C++ programming and software design skills, including debugging, performance analysis, and test design.
- Ability to work independently, define project goals and scope, and lead your own development effort.
- Strong interpersonal skills and ability to work in a dynamic product-oriented team.
Ways to Stand Out
- Knowledge of CPU and/or GPU architecture; CUDA or OpenCL programming experience.
- Experience with MLIR, LLVM, XLA, TVM, and deep learning models and algorithms.
Compensation and Benefits
- Base salary range: 152,000 USD - 241,500 USD (base salary will be determined based on location, experience, and pay of employees in similar positions).
- Eligible for equity and benefits (see NVIDIA benefits page).
Other Information
- Applications for this job will be accepted at least until March 15, 2026.
- This posting is for an existing vacancy.
- NVIDIA uses AI tools in its recruiting processes.
- NVIDIA is an equal opportunity employer and is committed to fostering a diverse work environment.