Senior Deep Learning Software Engineer, FlashInfer

at NVIDIA
USD 184,000-287,500 per year
SENIOR
✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Python @ 7, Machine Learning @ 4, TensorFlow @ 3, LLM @ 4, PyTorch @ 3, CUDA @ 7, GPU @ 4

Details

We are looking for a Senior Deep Learning Software Engineer to join the FlashInfer team and develop breakthrough AI inference systems software. The role focuses on building libraries, code generators, GPU kernel technologies, and runtimes to accelerate large language models (LLMs), agents, and other high-impact AI workloads on NVIDIA hardware.

Responsibilities

  • Innovate and develop new AI systems technologies for efficient inference.
  • Design, implement, and optimize GPU kernels for high-impact AI workloads.
  • Design and implement extensible abstractions for LLM serving engines.
  • Build efficient just-in-time (JIT) domain-specific compilers and runtimes.
  • Collaborate closely with engineers across deep learning frameworks, libraries, kernels, and GPU architecture teams.
  • Contribute to and engage with open source communities such as FlashInfer, vLLM, and SGLang.

Requirements

  • Master's degree in Computer Science, Electrical Engineering, or a related field (or equivalent experience); PhD preferred.
  • 6+ years of academic or industry experience in ML/DL systems development preferred.
  • Strong experience developing or using deep learning frameworks (e.g., PyTorch, JAX, TensorFlow, ONNX) and familiarity with inference engines and runtimes (e.g., vLLM, SGLang, MLC).
  • Strong programming skills in Python and C/C++.

Preferred / Ways to stand out

  • Background in domain-specific compiler and library solutions for LLM inference and training (e.g., FlashInfer, Flash Attention).
  • Expertise in inference engines such as vLLM and SGLang.
  • Experience with machine learning compilers (e.g., Apache TVM, MLIR).
  • Strong experience in GPU kernel development and performance optimizations, especially using CUDA C/C++, cuTile, Triton, or similar technologies.
  • Open source project ownership or contributions.

Benefits / Compensation

  • Base salary range: 184,000 USD - 287,500 USD (determined based on location, experience, and comparable roles).
  • Eligible for equity and company benefits (see NVIDIA benefits page).

Additional information

  • Applications for this job will be accepted at least until August 5, 2025.
  • NVIDIA is an equal opportunity employer committed to fostering a diverse work environment.