Used Tools & Technologies
Not specified
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level value.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — expert. You can teach others and know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Python @ 3
Communication @ 6
Debugging @ 3
LLM @ 6
GPU @ 3
Deep Learning @ 6
AI @ 6
Performance Analysis @ 3
Details
NVIDIA's invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, NVIDIA is increasingly known as "the AI computing company." This role is part of NVIDIA's compiler organization and focuses on pushing the boundaries of AI performance for next-generation GPUs.
Responsibilities
- Drive technical innovation through hands-on development focusing on kernel generation and computational graph optimizations for next-generation NVIDIA GPUs.
- Advance the state-of-the-art by solving complex compilation problems for AI workloads (inference and training) and transition breakthroughs into enterprise and consumer products.
- Collaborate on hardware/software co-design with experts across software, hardware, and research divisions to architect future silicon.
- Scale AI to the datacenter by participating in the advancement and optimization of datacenter-scale AI workload deployments.
Requirements
- BS or MS in Computer Science, Computer Engineering, or a related field (or equivalent experience). A PhD is strongly preferred.
- Compiler experience: 3+ years of relevant industry experience specializing in compiler optimizations, synthesis, and placement.
- MLIR knowledge: demonstrated, hands-on experience working with MLIR.
- Programming excellence: exceptional C/C++ and Python programming and software design skills, including rigorous debugging, performance analysis, and test design.
- Strong communication and interpersonal skills; ability to collaborate effectively in a dynamic, fast-paced, product-oriented environment.
Ways to Stand Out (Preferred / Nice-to-have)
- Hands-on experience implementing complex AI workloads on CPU, GPU, and/or custom AI accelerator architectures.
- Deep understanding of Large Language Model (LLM) inference and implications on computer architecture.
- Demonstrated experience designing and architecting comprehensive compiler frameworks from the ground up.
Benefits and Compensation
- Competitive base salary range: 152,000 USD - 241,500 USD (base salary will be determined based on location, experience, and pay of employees in similar positions).
- Eligible for equity and benefits (see https://www.nvidia.com/en-us/benefits/).
Additional Information
- Applications are accepted until at least April 5, 2026. This posting is for an existing vacancy.
- NVIDIA uses AI tools in its recruiting processes.
- NVIDIA is an equal opportunity employer and is committed to fostering a diverse work environment.