Senior High-Performance LLM Training Engineer

at NVIDIA
USD 184,000-356,500 per year
SENIOR
✅ Hybrid

Required Skills & Competences

Python @ 4, LLM @ 4, PyTorch @ 4, CUDA @ 4, GPU @ 3

Details

NVIDIA is seeking experienced engineers specializing in performance analysis and optimization to improve the efficiency of LLM training workloads, which are shaping the world's most advanced computing systems. This position focuses on optimizing NVIDIA’s LLM software stack in frameworks such as PyTorch and JAX for high-performance training on thousands of GPUs, and on helping shape hardware roadmaps for the next generation of GPUs powering the AI revolution.

Responsibilities

  • Understand, analyze, profile, and optimize AI training workloads on innovative hardware and software platforms.
  • Understand the big picture of training performance on GPUs, prioritizing and then solving problems across all state-of-the-art neural networks.
  • Implement production-quality software in multiple layers of NVIDIA's deep learning platform stack, from drivers to deep learning frameworks.
  • Build and support NVIDIA submissions to the MLPerf Training benchmark suite.
  • Implement key deep learning training workloads in NVIDIA's proprietary processor and system simulators to enable future architecture studies.
  • Build tools to automate workload analysis, workload optimization, and other critical workflows.

Requirements

  • PhD in Computer Science, Electrical Engineering, or Computer Engineering with 5+ years of meaningful work experience; or MS (or equivalent experience) with 8+ years of meaningful work experience.
  • Strong background in deep learning and neural networks, particularly training.
  • Deep background in computer architecture and familiarity with GPU architecture fundamentals.
  • Proven experience analyzing and tuning application performance, and in processor- and system-level performance modeling.
  • Programming skills in C++, Python, and CUDA.

Benefits

  • Highly competitive salary (base salary range is 184,000 USD - 356,500 USD).
  • Eligibility for equity and comprehensive benefits package.
  • Opportunity to collaborate with forward-thinking and hard-working people shaping the future of AI.
  • Creative and autonomous work environment that encourages innovation.

NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer.