Senior High-Performance LLM Training Engineer

at NVIDIA
USD 184,000-356,500 per year
SENIOR
✅ Hybrid

Required Skills & Competences

Python @ 4, LLM @ 4, PyTorch @ 6, CUDA @ 4, GPU @ 3

Details

We are now looking for a Senior High-Performance LLM Training Engineer at NVIDIA focused on performance analysis and optimization to improve the efficiency of LLM training workloads. This role centers on optimizing NVIDIA’s high-performance LLM software stack in frameworks like PyTorch and JAX for large-scale training on thousands of GPUs, while influencing hardware roadmaps for future GPUs.

Responsibilities

  • Understand, analyze, profile, and optimize AI training workloads on innovative hardware and software platforms.
  • Understand the big picture of training performance on GPUs; prioritize and solve performance problems across state-of-the-art neural networks.
  • Implement production-quality software across multiple layers of NVIDIA's deep learning platform stack, from drivers to deep-learning frameworks.
  • Build and support NVIDIA submissions to the MLPerf Training benchmark suite.
  • Implement key deep-learning training workloads in NVIDIA's proprietary processor and system simulators to enable future architecture studies.
  • Build tools to automate workload analysis, workload optimization, and other critical workflows.

Requirements

  • PhD in Computer Science, Electrical Engineering, or Computer Engineering plus 5+ years of experience; or MS (or equivalent experience) plus 8+ years of meaningful work experience.
  • Strong background in deep learning and neural networks, particularly training.
  • A deep background in computer architecture and familiarity with GPU architecture fundamentals.
  • Proven experience in analyzing and tuning application performance and in processor- and system-level performance modeling.
  • Programming skills in C++, Python, and CUDA.
  • Experience with deep-learning frameworks such as PyTorch and JAX and familiarity with MLPerf Training benchmarks.

Technologies and Tools Mentioned

  • PyTorch, JAX
  • C++, Python, CUDA
  • MLPerf Training benchmark suite
  • GPU architecture, processor and system simulators
  • Drivers and deep learning framework stacks

Benefits

  • Competitive base salary (ranges below) determined by location, experience, and internal pay equity.
  • Eligibility for equity and comprehensive benefits. (Link to NVIDIA benefits available in original posting.)
  • Opportunity to work across the full hardware and software stack and collaborate with cross-functional teams shaping future AI systems.

Additional Details

  • Office policy: Hybrid (#LI-Hybrid indicated).
  • Location: Santa Clara, CA, United States.
  • Applications accepted at least until July 29, 2025.
  • NVIDIA is an equal opportunity employer committed to diversity and non-discrimination.

Salary Ranges (as listed)

  • Level 4 base salary range: 184,000 USD - 287,500 USD
  • Level 5 base salary range: 224,000 USD - 356,500 USD