Senior Deep Learning Algorithm Engineer, Training Framework

at NVIDIA

📍 Santa Clara, United States

USD 184,000-356,500 per year

SENIOR

✅ Hybrid

Required Skills & Competences

Python @ 6, GitHub @ 4, Algorithms @ 4, Machine Learning @ 4, Mathematics @ 6, Performance Optimization @ 4, Debugging @ 6, API @ 4, LLM @ 4, PyTorch @ 4, GPU @ 4

Details

NVIDIA's NeMo Framework team is seeking engineers to design, develop, and optimize diverse real-world workloads for large-scale model training. NeMo is an open-source, scalable, cloud-native framework for researchers and developers working on pretraining and post-training of large language models (LLMs) and multimodal (MM) foundation models. The role focuses on expanding NeMo Framework capabilities across distributed training, model parallelism, performance optimization, APIs, toolkits, and libraries.

Responsibilities

Develop algorithms for AI/Deep Learning, data analytics, machine learning, or scientific computing.
Contribute to and advance the open-source NeMo Framework (https://github.com/nvidia-nemo).
Solve large-scale, end-to-end AI training and inference challenges across the full model lifecycle: orchestration, data preprocessing, model training and tuning, and deployment.
Work at the intersection of computer architecture, libraries, frameworks, AI applications, and the software stack.
Innovate and improve model architectures, distributed training algorithms, and model parallel paradigms.
Perform performance tuning and optimizations, including model training and fine-tuning with mixed precision recipes on next-generation NVIDIA GPU architectures.
Research, prototype, and develop robust and scalable AI tools and pipelines.

Requirements

MS, PhD, or equivalent experience in Computer Science, AI, Applied Mathematics, or a related field, plus 5+ years of industry experience.
Experience with AI frameworks such as PyTorch and JAX.
Experience with inference and deployment environments such as TRTLLM, vLLM, and SGLang.
Proficiency in Python programming, software design, debugging, performance analysis, test design, and documentation.
Track record of working effectively across multiple engineering initiatives and improving AI libraries with new innovations.
Strong understanding of AI/Deep Learning fundamentals and their practical applications.

Ways to Stand Out (Preferred / Nice to Have)

Hands-on experience in large-scale AI training and deep understanding of compute system concepts (latency/throughput bottlenecks, pipelining, multiprocessing) with demonstrated performance analysis and tuning.
Expertise in distributed computing and model parallelism.
Expertise in mixed precision training.
Prior experience with Generative AI techniques applied to LLMs and multimodal learning (text, image, video).
Knowledge of GPU/CPU architecture and numerical software.
Experience creating or contributing to open-source deep learning frameworks.

Compensation & Benefits

Base salary ranges (the base salary is determined by location, experience, and the pay of comparable employees):
- Level 4: 184,000 USD - 287,500 USD
- Level 5: 224,000 USD - 356,500 USD
You will also be eligible for equity and NVIDIA benefits.

Other Details

Location: Santa Clara, CA, United States.
Employment type: Full time.
Application window: Applications will be accepted at least until October 10, 2025.

NVIDIA is an equal opportunity employer committed to fostering a diverse work environment and does not discriminate on the basis of protected characteristics.