Senior Deep Learning Software Engineer, Inference

at NVIDIA

📍 Santa Clara, United States

USD 148,000-287,500 per year

SENIOR

✅ On-site

Required Skills & Competences

Software Development @ 6
Python @ 1
Performance Optimization @ 4
Debugging @ 1
LLM @ 4
PyTorch @ 4
Agile @ 1
CUDA @ 4
GPU @ 4

Details

NVIDIA seeks a Senior Software Engineer specializing in Deep Learning Inference for our growing team. As a key contributor, you will help design, build, and optimize the GPU-accelerated software that powers today’s most sophisticated AI applications. The team develops and maintains high-performance open-source frameworks for efficient large-scale model serving and inference, focusing on deployment and serving of large language models and generative AI models across NVIDIA accelerators.

Responsibilities

Performance optimization, analysis, and tuning of deep learning models across domains such as LLM, multimodal, and generative AI.
Scale performance of deep learning models across different architectures and types of NVIDIA accelerators (datacenter GPUs to edge SoCs).
Contribute features and code to NVIDIA’s inference libraries and open-source projects (vLLM, SGLang, FlashInfer, and other LLM software solutions).
Collaborate with cross-functional teams across frameworks, NVIDIA libraries, and inference optimization groups.
Use and integrate open-source tools and plugins, including CUTLASS, OpenAI Triton, NCCL, and CUDA kernels, to implement and optimize model serving pipelines.

Requirements

Master’s or PhD (or equivalent experience) in a relevant field (Computer Engineering, Computer Science, EECS, AI).
5+ years of relevant software development experience.
Excellent C/C++ programming and software design skills.
Agile software development experience is helpful; Python experience is a plus.
Prior experience training, deploying, or optimizing deep learning models for production inference is a plus.
A background in performance modeling, profiling, debugging, and code optimization, along with architectural knowledge of CPUs and GPUs, is a plus.
GPU programming experience (CUDA, Triton or CUTLASS) is a plus.

Ways to Stand Out

Contributions to deep learning software projects (PyTorch, vLLM, SGLang).
Experience with multi-GPU communications (NCCL, NVSHMEM).

Compensation & Benefits

Your base salary will be determined based on your location, your experience, and the pay of employees in similar positions. The base salary ranges are:

Level 3: 148,000 USD - 235,750 USD
Level 4: 184,000 USD - 287,500 USD

You will also be eligible for equity and benefits. NVIDIA offers a comprehensive benefits package.

Additional Details

Applications for this job will be accepted at least until September 12, 2025.
NVIDIA is an equal opportunity employer and committed to fostering a diverse work environment.