Senior Software Engineer, AI and DL Kernel Libraries

at NVIDIA

📍 Santa Clara, United States

USD 184,000-287,500 per year

SENIOR

✅ On-site

Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level. The levels mean:

1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose;

3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology;

7-9: expert. Able to teach others and familiar with the pitfalls and tricks;

10: exceptional knowledge. Comprehensive understanding of and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.

Python @ 7
Machine Learning @ 4
TensorFlow @ 7
LLM @ 4
PyTorch @ 7
CUDA @ 7
GPU @ 4
Deep Learning @ 4
AI @ 4
vLLM @ 4
SGLang @ 4

Details

We are looking for outstanding AI systems engineers to develop groundbreaking technologies in the inference systems software stack. The team builds AI systems software to accelerate inference by developing libraries, code generators, and GPU kernel technologies for NVIDIA's hardware architecture. Work includes designing efficient attention kernel implementations, LLM inference runtime components, kernel code generators, and other technologies to accelerate large language models and high-impact AI workloads.

Responsibilities

Innovate and develop new AI systems technologies for efficient inference
Design, implement, and optimize kernels for high-impact AI workloads
Design and implement extensible abstractions for LLM serving engines
Build efficient just-in-time domain-specific compilers and runtimes
Collaborate closely with engineers across deep learning frameworks, libraries, kernels, and GPU architecture teams
Contribute to open source projects such as FlashInfer, vLLM, and SGLang

Requirements

Master's degree in Computer Science, Electrical Engineering, or related field (or equivalent experience); PhD preferred
6+ years of experience (academic or industry) in ML/DL systems development preferred
Strong experience developing or using deep learning frameworks (examples: PyTorch, JAX, TensorFlow, ONNX)
Experience with inference engines and runtimes (examples: vLLM, SGLang, MLC) is desirable
Strong Python and C/C++ programming skills
Strong experience in GPU kernel development and performance optimizations, especially using CUDA C/C++, cuTile, Triton, or similar technologies

Ways to Stand Out

Background in domain-specific compiler and library solutions for LLM inference and training (e.g., FlashInfer, Flash Attention)
Expertise in inference engines like vLLM and SGLang
Expertise in machine learning compilers (e.g., Apache TVM, MLIR)
Open source project ownership or contributions

Compensation & Benefits

Base salary range: 184,000 USD - 287,500 USD (determined based on location, experience, and pay of employees in similar positions)
Eligible for equity and benefits

Other Information

Applications accepted at least until March 15, 2026
This posting is for an existing vacancy
NVIDIA uses AI tools in its recruiting processes
NVIDIA is an equal opportunity employer and values diversity