Deep Learning Software Engineer, FlashInfer - New College Grad 2025
at NVIDIA
USD 104,000-172,500 per year
Required Skills & Competences
Python @ 6, Machine Learning @ 3, TensorFlow @ 6, LLM @ 3, PyTorch @ 6, CUDA @ 6, GPU @ 3
Details
NVIDIA is seeking Deep Learning Software Engineers to develop AI inference systems software that accelerates inference for large language models and other AI workloads. The team builds libraries, code generators, GPU kernel technologies, and inference runtimes (for example FlashInfer, vLLM, SGLang) to optimize LLM serving and other high-impact AI workloads. This role involves designing abstractions, implementing efficient kernels, building JIT domain-specific compilers and runtimes, and collaborating with framework, library, and GPU architecture teams.
Responsibilities
- Innovate and develop new AI systems technologies for efficient inference
- Design, implement, and optimize kernels for high-impact AI workloads
- Design and implement extensible abstractions for LLM serving engines
- Build efficient just-in-time domain-specific compilers and runtimes
- Collaborate closely with other engineers across deep learning framework, library, kernel, and GPU architecture teams
- Contribute to open source communities and projects such as FlashInfer, vLLM, and SGLang
Requirements
- Bachelor's degree in Computer Science, Electrical Engineering, or a related field (or equivalent experience); PhD preferred
- Strong experience in developing or using deep learning frameworks (examples: PyTorch, JAX, TensorFlow, ONNX)
- Ideally, experience with inference engines and runtimes such as vLLM, SGLang, and MLC
- Strong Python and C/C++ programming skills
Ways to stand out
- Background in domain-specific compiler and library solutions for LLM inference and training (e.g., FlashInfer, Flash Attention)
- Expertise in inference engines like vLLM and SGLang
- Expertise in machine learning compilers (e.g., Apache TVM, MLIR)
- Strong experience in GPU kernel development and performance optimization (especially using CUDA C/C++, cuTile, Triton, or similar)
- Open-source project ownership or contributions
Compensation & Benefits
- Base salary range: 104,000 USD - 172,500 USD (final base salary determined based on location, experience, and internal pay equity)
- Eligible for equity and benefits (see NVIDIA benefits)
Other details
- Applications for this job will be accepted at least until August 22, 2025.
- NVIDIA is an equal opportunity employer and values diversity in its workforce.