Senior High-Performance LLM Training Engineer

at NVIDIA

📍 Santa Clara, United States

USD 184,000-356,500 per year

SENIOR

✅ Hybrid

Required Skills & Competences

Python @ 4, LLM @ 4, PyTorch @ 6, CUDA @ 4, GPU @ 3

Details

We are now looking for a Senior High-Performance LLM Training Engineer at NVIDIA focused on performance analysis and optimization to improve the efficiency of LLM training workloads. This role centers on optimizing NVIDIA’s high-performance LLM software stack in frameworks like PyTorch and JAX for large-scale training on thousands of GPUs, while influencing hardware roadmaps for future GPUs.

Responsibilities

Understand, analyze, profile, and optimize AI training workloads on innovative hardware and software platforms.
Understand the big picture of training performance on GPUs; prioritize and solve performance problems across state-of-the-art neural networks.
Implement production-quality software across multiple layers of NVIDIA's deep learning platform stack, from drivers to deep-learning frameworks.
Build and support NVIDIA submissions to the MLPerf Training benchmark suite.
Implement key deep-learning training workloads in NVIDIA's proprietary processor and system simulators to enable future architecture studies.
Build tools to automate workload analysis, workload optimization, and other critical workflows.

Requirements

PhD in Computer Science, Electrical Engineering, or Computer Engineering plus 5+ years of experience; or MS (or equivalent experience) plus 8+ years of meaningful work experience.
Strong background in deep learning and neural networks, particularly training.
A deep background in computer architecture and familiarity with GPU architecture fundamentals.
Proven experience in analyzing and tuning application performance, and in processor- and system-level performance modeling.
Programming skills in C++, Python, and CUDA.
Experience with deep-learning frameworks such as PyTorch and JAX, and familiarity with MLPerf Training benchmarks.

Technologies and Tools Mentioned

PyTorch, JAX
C++, Python, CUDA
MLPerf Training benchmark suite
GPU architecture, processor and system simulators
Drivers and deep-learning framework stacks

Benefits

Competitive base salary (ranges below) determined by location, experience, and internal pay equity.
Eligibility for equity and comprehensive benefits. (Link to NVIDIA benefits available in original posting.)
Opportunity to work across the full hardware and software stack and collaborate with cross-functional teams shaping future AI systems.

Additional Details

Office policy: Hybrid (#LI-Hybrid indicated).
Location: Santa Clara, CA, United States.
Applications accepted at least until July 29, 2025.
NVIDIA is an equal opportunity employer committed to diversity and non-discrimination.

Salary Ranges (as listed)

Level 4 base salary range: 184,000 USD - 287,500 USD
Level 5 base salary range: 224,000 USD - 356,500 USD