DL Performance Software Engineer - LLM Inference

at NVIDIA
USD 120,000-235,750 per year
Mid-level to Senior
✅ On-site

Required Skills & Competences

Python @ 3, GitHub @ 3, Distributed Systems @ 3, Machine Learning @ 3, Parallel Programming @ 3, Debugging @ 3, LLM @ 3, PyTorch @ 3, CUDA @ 3, GPU @ 3

Details

At NVIDIA we believe artificial intelligence (AI) will fundamentally transform how people live and work. The LLM Inference team builds software to make large language model (LLM) inference more efficient, scalable, and accessible. The team collaborates across resource orchestration, distributed systems, inference engine optimization, and high-performance GPU kernel development to deliver production-quality inference stacks.

Responsibilities

  • Write safe, scalable, modular, high-quality backend software in C++ and Python for LLM inference.
  • Perform benchmarking, profiling, and system-level programming for GPU applications.
  • Contribute design documents, provide code reviews, and create tutorials to facilitate team collaboration.
  • Implement and run unit tests and performance tests across stages of the inference pipeline.

Requirements

  • Bachelor's degree in Computer Science, Computer Engineering, a relevant technical field, or equivalent experience.
  • Strong coding skills in Python and C/C++.
  • 2+ years of industry software engineering experience or equivalent research experience.
  • Knowledge of and passion for machine learning and performance engineering.
  • Proven project experience building software where performance is a core requirement.

Ways to stand out

  • Solid fundamentals in machine learning, deep learning, operating systems, computer architecture, and parallel programming.
  • Research experience in systems or machine learning.
  • Project experience with modern deep learning software such as PyTorch, CUDA, vLLM, SGLang, and TensorRT-LLM.
  • Experience with performance modeling, profiling, debugging, and code optimization, plus architectural knowledge of CPUs and GPUs.

Compensation

  • Base salary range is determined by location and level:
    • Level 2: 120,000 USD - 189,750 USD
    • Level 3: 148,000 USD - 235,750 USD
  • You will also be eligible for equity and benefits.

Other details

  • Employment type: Full time.
  • Applications will be accepted at least until September 28, 2025.
  • NVIDIA is an equal opportunity employer and fosters a diverse work environment. Applicants are encouraged to include sample projects (e.g., GitHub) that demonstrate the qualifications above.