Senior Software Engineer, AI and DL Kernel Libraries

at NVIDIA
USD 184,000-287,500 per year
SENIOR
✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Python @ 7, Machine Learning @ 4, TensorFlow @ 7, LLM @ 4, PyTorch @ 7, CUDA @ 7, GPU @ 4, Deep Learning @ 4, AI @ 4, vLLM @ 4, SGLang @ 4

Details

We are looking for outstanding AI systems engineers to develop groundbreaking technologies in the inference systems software stack. The team builds AI systems software to accelerate inference by developing libraries, code generators, and GPU kernel technologies for NVIDIA's hardware architecture. Work includes designing efficient attention kernel implementations, LLM inference runtime components, kernel code generators, and other technologies to accelerate large language models and high-impact AI workloads.

Responsibilities

  • Innovate and develop new AI systems technologies for efficient inference
  • Design, implement, and optimize kernels for high-impact AI workloads
  • Design and implement extensible abstractions for LLM serving engines
  • Build efficient just-in-time domain-specific compilers and runtimes
  • Collaborate closely with engineers across deep learning frameworks, libraries, kernels, and GPU architecture teams
  • Contribute to open source projects such as FlashInfer, vLLM, and SGLang

Requirements

  • Master's degree in Computer Science, Electrical Engineering, or related field (or equivalent experience); PhD preferred
  • 6+ years of academic or industry experience in ML/DL systems development preferred
  • Strong experience developing or using deep learning frameworks (examples: PyTorch, JAX, TensorFlow, ONNX)
  • Experience with inference engines and runtimes (examples: vLLM, SGLang, MLC) is desirable
  • Strong Python and C/C++ programming skills
  • Strong experience in GPU kernel development and performance optimizations, especially using CUDA C/C++, cuTile, Triton, or similar technologies

Ways to Stand Out

  • Background in domain-specific compiler and library solutions for LLM inference and training (e.g., FlashInfer, Flash Attention)
  • Expertise in inference engines like vLLM and SGLang
  • Expertise in machine learning compilers (e.g., Apache TVM, MLIR)
  • Open source project ownership or contributions

Compensation & Benefits

  • Base salary range: 184,000 USD - 287,500 USD (determined based on location, experience, and pay of employees in similar positions)
  • Eligible for equity and benefits (link to NVIDIA benefits)

Other Information

  • Applications accepted at least until March 15, 2026
  • This posting is for an existing vacancy
  • NVIDIA uses AI tools in its recruiting processes
  • NVIDIA is an equal opportunity employer and values diversity