Senior Software Engineer, Machine Learning Inference

at NVIDIA
USD 152,000-287,500 per year
SENIOR
✅ Hybrid

Used Tools & Technologies

GenAI

Required Skills & Competences

Software Development @ 6, Python @ 4, Machine Learning @ 4, Communication @ 7, Rust @ 7, LLM @ 4, PyTorch @ 4, CUDA @ 4, GPU @ 6, Deep Learning @ 4, Generative AI @ 4, AI @ 4, vLLM @ 4, OpenCL @ 4, TensorRT @ 4, SGLang @ 4, Performance Analysis @ 4, JAX @ 4

Details

At NVIDIA, we're at the forefront of innovation, driving advancements in AI and machine learning to solve some of the world’s most challenging problems. The TensorRT team develops industry-leading deep learning inference software for NVIDIA AI accelerators. As a Senior Software Engineer on the TensorRT team, you will design and implement inference software optimizations to power AI applications on NVIDIA GPUs.

Responsibilities

  • Design, develop, and optimize NVIDIA TensorRT and TensorRT-LLM to accelerate inference applications for datacenters, workstations, and PCs.
  • Develop software in C++, Python, and CUDA for seamless and efficient deployment of state-of-the-art LLMs and Generative AI models.
  • Collaborate with deep learning experts and GPU architects across the company to influence hardware and software design for inference.

Requirements

  • BS, MS, PhD, or equivalent experience in Computer Science, Computer Engineering, or a related field.
  • 4+ years of software development experience on a large codebase or project.
  • Strong proficiency in C++ (required); experience with Rust or Python is also relevant.
  • Experience in developing deep learning frameworks, compilers, or system software.
  • Excellent problem-solving skills and the ability to learn and work effectively in a fast-paced, collaborative environment.
  • Strong communication skills and the ability to articulate complex technical concepts.

Ways to stand out

  • Experience in developing inference backends and compilers for GPUs.
  • Knowledge of machine learning techniques and GPU programming with CUDA or OpenCL.
  • Background working with LLM inference frameworks such as TensorRT-LLM, vLLM, or SGLang.
  • Experience with deep learning frameworks such as TensorRT, PyTorch, or JAX.
  • Knowledge of close-to-metal performance analysis, optimization techniques, and tools.

Compensation

  • Base salary range (Level 3): 152,000 USD - 241,500 USD per year.
  • Base salary range (Level 4): 184,000 USD - 287,500 USD per year.
  • You will also be eligible for equity and benefits.

Additional information

  • Applications for this job will be accepted at least until April 14, 2026.
  • This posting is for an existing vacancy.
  • NVIDIA uses AI tools in its recruiting processes.
  • NVIDIA is an equal opportunity employer and is committed to fostering a diverse work environment.