Senior DL Algorithms Engineer - Inference Performance

at NVIDIA
USD 152,000-287,500 per year
SENIOR
✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

  • Algorithms @ 4
  • Performance Optimization @ 4
  • LLM @ 4
  • PyTorch @ 6
  • CUDA @ 1
  • GPU @ 4
  • Deep Learning @ 4
  • AI @ 4
  • Profiling @ 4
  • vLLM @ 4
  • OpenCL @ 1
  • SGLang @ 4
  • HPC @ 6
  • Performance Analysis @ 4

Details

We are looking for a Senior DL Algorithms Engineer focused on LLM/Omni model inference optimization. This role is for engineers who analyze and optimize performance across the full hardware/software stack, from GPU architecture to deep learning frameworks, to maximize inference performance. You will directly impact hardware and software roadmaps at a fast-growing AI company.

Responsibilities

  • Enable and optimize state-of-the-art open models (examples: Nemotron and Cosmos) on NVIDIA's accelerated inference software stack.
  • Contribute new features, fix bugs, and deliver production code to open-source frameworks such as TRT-LLM, vLLM, SGLang, and FlashInfer.
  • Profile and analyze bottlenecks across the full inference stack to push inference performance boundaries.
  • Benchmark state-of-the-art offerings and perform competitive analysis for NVIDIA's software/hardware stack.
  • Co-design with partner teams to develop the next generation of AI models and services.

Requirements

  • PhD in Computer Science, Electrical Engineering, CSEE, or equivalent experience.
  • 3+ years of experience.
  • Strong background in deep learning and neural networks, particularly inference.
  • Experience with performance profiling, analysis, and optimization, especially for GPU-based applications.
  • Proficient in PyTorch or equivalent frameworks for AI, or experience in HPC-heavy application development.
  • Deep understanding of computer architecture and familiarity with GPU architecture fundamentals.

Ways to Stand Out

  • Proven experience with processor and system-level performance optimization.
  • Deep understanding of modern LLM and diffusion model architectures.
  • Strong fundamentals in algorithms.
  • GPU programming experience (CUDA or OpenCL) is a strong plus.

Compensation & Benefits

  • Base salary ranges:
    • Level 3: 152,000 USD – 241,500 USD
    • Level 4: 184,000 USD – 287,500 USD
  • Eligible for equity and benefits (see company benefits page).

Other Information

  • Applications accepted until May 9, 2026. This posting is for an existing vacancy.
  • NVIDIA uses AI tools in its recruiting processes.

Equal Opportunity

NVIDIA is committed to fostering a diverse work environment and is proud to be an equal opportunity employer. The company does not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.