Required Skills & Competences
Python @ 4, Hiring @ 4, LLM @ 4, PyTorch @ 6, CUDA @ 4, GPU @ 3
Details
We are looking for a Senior High-Performance LLM Training Engineer to improve the efficiency of LLM training workloads across NVIDIA's hardware and software stack. The role focuses on optimizing training performance in frameworks like PyTorch and JAX at the scale of thousands of GPUs, and on influencing hardware roadmaps for next-generation GPUs.
Responsibilities
- Understand, analyze, profile, and optimize AI training workloads on innovative hardware and software platforms.
 - Understand the big picture of training performance on GPUs; prioritize and solve performance problems across state-of-the-art neural networks.
 - Implement production-quality software in multiple layers of NVIDIA's deep learning platform stack, from drivers to deep learning frameworks.
 - Build and support NVIDIA submissions to the MLPerf Training benchmark suite.
 - Implement key deep-learning training workloads in NVIDIA's proprietary processor and system simulators to enable future architecture studies.
 - Build tools to automate workload analysis, workload optimization, and other critical workflows.
 
Requirements
- PhD in Computer Science, Electrical Engineering, or Computer Engineering and 5+ years of meaningful work experience; or MS (or equivalent experience) and 8+ years of meaningful work experience.
 - Strong background in deep learning and neural networks, particularly training.
 - A deep background in computer architecture and familiarity with fundamentals of GPU architecture.
 - Proven experience in analyzing and tuning application performance, and in processor- and system-level performance modeling.
 - Programming skills in C++, Python, and CUDA.
 
Compensation and Benefits
- Base salary ranges (determined by location, experience, and comparable employees):
  - Level 4: 184,000 USD - 287,500 USD
  - Level 5: 224,000 USD - 356,500 USD
- Eligible for equity and benefits.
 
Additional Details
- Work arrangement: Hybrid
 - Location: Santa Clara, CA, United States
 - Applications accepted at least until July 29, 2025.
 - NVIDIA is an equal opportunity employer and values diversity in hiring and promotion practices.