Senior Deep Learning Software Engineer, Inference

at NVIDIA
USD 148,000-287,500 per year
SENIOR
✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Software Development @ 6, Python @ 1, Performance Optimization @ 4, Debugging @ 1, LLM @ 4, PyTorch @ 4, Agile @ 1, CUDA @ 4, GPU @ 4

Details

NVIDIA seeks a Senior Software Engineer specializing in Deep Learning Inference for our growing team. As a key contributor, you will help design, build, and optimize the GPU-accelerated software that powers today’s most sophisticated AI applications. The team develops and maintains high-performance open-source frameworks for efficient large-scale model serving and inference, focusing on deployment and serving of large language models and generative AI models across NVIDIA accelerators.

Responsibilities

  • Performance optimization, analysis, and tuning of deep learning models across domains such as LLM, multimodal, and generative AI.
  • Scale performance of deep learning models across different architectures and types of NVIDIA accelerators, from datacenter GPUs to edge SoCs.
  • Contribute features and code to NVIDIA’s inference libraries and open-source projects (vLLM, SGLang, FlashInfer, and other LLM software solutions).
  • Collaborate with cross-functional teams across frameworks, NVIDIA libraries, and inference optimization groups.
  • Use and integrate open-source tools and plugins, including CUTLASS, OpenAI Triton, NCCL, and CUDA kernels, to implement and optimize model serving pipelines.

Requirements

  • Master’s or PhD (or equivalent experience) in a relevant field (Computer Engineering, Computer Science, EECS, AI).
  • 5+ years of relevant software development experience.
  • Excellent C/C++ programming and software design skills.
  • Agile software development experience is helpful; Python experience is a plus.
  • Prior experience training, deploying, or optimizing deep learning models for inference in production is a plus.
  • Background in performance modeling, profiling, debugging, and code optimization, along with architectural knowledge of CPUs and GPUs, is a plus.
  • GPU programming experience (CUDA, Triton, or CUTLASS) is a plus.

Ways to Stand Out

  • Contributions to deep learning software projects (PyTorch, vLLM, SGLang).
  • Experience with multi-GPU communications (NCCL, NVSHMEM).

Compensation & Benefits

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary ranges provided are:

  • Level 3: 148,000 USD - 235,750 USD
  • Level 4: 184,000 USD - 287,500 USD

You will also be eligible for equity and benefits. NVIDIA offers a comprehensive benefits package.

Additional Details

  • Applications for this job will be accepted at least until September 12, 2025.
  • NVIDIA is an equal opportunity employer and committed to fostering a diverse work environment.