Senior Deep Learning Algorithm Engineer

at NVIDIA

📍 Santa Clara, United States

USD 224,000-356,500 per year

SENIOR

✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Python @ 6, Algorithms @ 4, Hiring @ 4, Debugging @ 6, API @ 4, LLM @ 4, PyTorch @ 4, GPU @ 4

Details

NVIDIA is hiring engineers for the core AI Frameworks (Megatron Core and NeMo Framework) team to design, develop, and optimize diverse real-world workloads. Megatron Core and NeMo Framework are open-source, scalable, cloud-native frameworks built for researchers and developers working on Large Language Model (LLM) and Multimodal (MM) foundation model pretraining and post-training. These GenAI frameworks support end-to-end model training, including pretraining, reasoning, alignment, customization, evaluation, and deployment, along with tooling to optimize performance and user experience.

Applications for this job will be accepted at least until August 19, 2025.

Responsibilities

Expand Megatron Core and NeMo Framework capabilities to enable users to develop, train, and optimize models.
Design and implement distributed training algorithms, model parallel paradigms, and model optimizations.
Define robust APIs and expand toolkits and libraries for coherence and coverage.
Analyze and tune performance across the software stack, including mixed-precision recipes on next-generation NVIDIA GPU architectures.
Solve large-scale, end-to-end AI training and inference challenges across the full model lifecycle (orchestration, data pre-processing, training, tuning, deployment).
Work at the intersection of computer architecture, libraries, frameworks, AI applications, and the full software stack.
Research, prototype, and develop robust and scalable AI tools and pipelines.
Contribute to and advance open-source projects (Megatron Core, NeMo Framework).

Requirements

MS or PhD (or equivalent experience) in Computer Science, AI, Applied Math, or a related field, plus 10+ years of industry experience.
Experience with AI frameworks such as PyTorch and JAX, and/or inference and deployment environments (for example, TRTLLM, vLLM, SGLang).
Proficiency in Python programming, software design, debugging, performance analysis, test design, and documentation.
Strong understanding of AI / Deep Learning fundamentals and their practical applications.
Proven track record of working effectively across multiple engineering initiatives and improving AI libraries with new innovations.

Ways to stand out

Hands-on experience in large-scale AI training and deep understanding of core compute system concepts (latency/throughput bottlenecks, pipelining, multiprocessing) with demonstrated performance analysis and tuning.
Expertise in distributed computing, model parallelism, and mixed precision training.
Prior experience with Generative AI techniques applied to LLMs and multimodal learning (text, image, video).
Knowledge of GPU/CPU architecture and related numerical software.
Contributions to open-source deep learning frameworks.

Compensation & Benefits

Base salary range: 224,000 USD - 356,500 USD (final base salary determined by location, experience, and pay of employees in similar positions).
Eligible for equity and other benefits.

Additional

NVIDIA emphasizes creativity, autonomy, and collaboration, and is an equal opportunity employer committed to diversity and inclusion.