Used Tools & Technologies
Not specified
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose;
- 3-6: daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9: expert. Able to teach others and aware of all the pitfalls and tricks;
- 10: exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Python @ 4
Algorithms @ 4
Hiring @ 4
Mentoring @ 4
Debugging @ 4
API @ 4
PyTorch @ 4
CUDA @ 6
GPU @ 4
Deep Learning @ 4
AI @ 4
Robotics @ 4
OpenCL @ 6
Performance Analysis @ 4
Details
NVIDIA is hiring software engineers for its Deep Learning & AI Compiler (DLC) team. The DLC is the backbone of NVIDIA's inference engine across data centers, personal devices, automotive, and robotics. The compiler must deliver leading inference performance, fast build times, a reduced memory footprint, and ease of use in both Ahead-of-Time and Just-in-Time modes.
Responsibilities
- Analyze deep learning networks and develop compiler optimization algorithms.
- Collaborate with deep learning software framework teams and GPU architecture teams to accelerate next-generation deep learning software.
- Define public APIs, implement performance optimizations and analysis, and craft compiler techniques for AI workloads and future NVIDIA GPUs.
Requirements
- Bachelor's, Master's, or Ph.D. in Computer Science, Computer Engineering, or a related field, or equivalent experience.
- 3+ years of relevant work or research experience in performance analysis and compiler optimizations.
- Experience with compiler technologies such as MLIR, LLVM, XLA, or Triton.
- Excellent C/C++ and Python programming and software design skills, including debugging, performance analysis, and test design.
- Ability to work independently, define project goals and scope, and lead development efforts.
- Strong interpersonal skills and ability to work in a dynamic product-oriented team.
Ways to stand out from the crowd
- Proficiency in CPU and/or GPU architecture; CUDA or OpenCL programming experience.
- Understanding of deep learning models, algorithms and frameworks (such as PyTorch, JAX).
- GPU kernel authoring and performance analysis using tools such as Nsight Compute.
- Experience mentoring early-career engineers and interns.
- Track record of new hardware bring-up.
Compensation and Benefits
- Base salary range: 152,000 USD - 241,500 USD (determined based on location, experience, and pay of employees in similar positions).
- Eligible for equity and benefits (link to NVIDIA benefits provided in posting).
Applications for this job will be accepted at least until February 28, 2026. This posting is for an existing vacancy. NVIDIA uses AI tools in its recruiting processes. NVIDIA is an equal opportunity employer and is committed to fostering a diverse work environment.