Senior Deep Learning Software Engineer, Inference

at NVIDIA
USD 148,000-287,500 per year
SENIOR
✅ Hybrid

Required Skills & Competences

Software Development @ 6, Python @ 1, Performance Optimization @ 4, Debugging @ 4, LLM @ 4, PyTorch @ 4, Agile @ 3, CUDA @ 4, GPU @ 4

Details

NVIDIA seeks a Senior Software Engineer specializing in Deep Learning Inference to design, build, and optimize GPU-accelerated software that powers advanced AI applications. The team develops and maintains high-performance deep learning frameworks, including SGLang and vLLM, for efficient large-scale inference, enabling state-of-the-art language models to be deployed and served across NVIDIA accelerators.

Responsibilities

  • Analyze, optimize, and tune the performance of deep learning models in domains such as LLM, multimodal, and generative AI.
  • Scale the performance of deep learning models across different architectures and NVIDIA accelerators, from datacenter GPUs to edge SoCs.
  • Contribute features and code to NVIDIA inference libraries and projects (vLLM, SGLang, FlashInfer, LLM software solutions).
  • Work cross-functionally with teams across frameworks, NVIDIA libraries, and inference optimization groups to build innovative solutions.
  • Implement and optimize model serving pipelines using open-source tools and plugins (CUTLASS, Triton, NCCL, CUDA kernels, etc.).

Requirements

  • Master's or PhD in Computer Engineering, Computer Science, EECS, AI, or a related field, or equivalent experience.
  • 5+ years of relevant software development experience.
  • Excellent C/C++ programming and software design skills.
  • Experience with training, deploying, or optimizing inference of deep learning models in production is a plus.
  • Background in performance modeling, profiling, debugging, code optimization, or CPU/GPU architecture knowledge is a plus.
  • Python experience is a plus. Familiarity with Agile software development practices is helpful.

Ways to stand out

  • Contributions to deep learning software projects such as PyTorch, vLLM, or SGLang.
  • Experience with multi-GPU communications (NCCL, NVSHMEM).
  • Experience building and shipping products to enterprise customers.
  • GPU programming experience (CUDA, Triton, CUTLASS).

Compensation & Other Details

  • Base salary ranges: Level 3: 148,000 USD - 235,750 USD; Level 4: 184,000 USD - 287,500 USD. Exact base salary will be determined based on location, experience, and pay of employees in similar positions.
  • Eligible for equity and benefits (see NVIDIA benefits page).
  • Applications for this job will be accepted at least until September 7, 2025.

Company

NVIDIA is focused on breakthroughs in Artificial Intelligence, High-Performance Computing, and Visualization. The company is an equal opportunity employer committed to fostering a diverse work environment.
