Senior Deep Learning Algorithm Engineer, Training Framework
at NVIDIA
Santa Clara, United States
USD 184,000-356,500 per year
Required Skills & Competences
Python @ 6, GitHub @ 4, Algorithms @ 4, Machine Learning @ 4, Mathematics @ 6, Performance Optimization @ 4, Debugging @ 6, API @ 4, LLM @ 4, PyTorch @ 4, GPU @ 4
Details
NVIDIA's NeMo Framework team is seeking engineers to design, develop, and optimize diverse real-world workloads for large-scale model training. NeMo is an open-source, scalable, cloud-native framework for researchers and developers working on Large Language Models (LLM) and Multimodal (MM) foundation model pretraining and post-training. The role focuses on expanding NeMo Framework capabilities across distributed training, model parallelism, performance optimization, APIs, toolkits, and libraries.
Responsibilities
- Develop algorithms for AI/Deep Learning, data analytics, machine learning, or scientific computing.
- Contribute to and advance the open-source NeMo Framework (https://github.com/nvidia-nemo).
- Solve large-scale, end-to-end AI training and inference challenges across the full model lifecycle: orchestration, data preprocessing, model training and tuning, and deployment.
- Work at the intersection of computer architecture, libraries, frameworks, AI applications, and the software stack.
- Innovate and improve model architectures, distributed training algorithms, and model parallel paradigms.
- Perform performance tuning and optimizations, including model training and fine-tuning with mixed precision recipes on next-generation NVIDIA GPU architectures.
- Research, prototype, and develop robust and scalable AI tools and pipelines.
 
Requirements
- MS, PhD, or equivalent experience in Computer Science, AI, Applied Mathematics, or a related field, plus 5+ years of industry experience.
- Experience with AI frameworks such as PyTorch and JAX.
- Experience with inference and deployment environments such as TensorRT-LLM (TRTLLM), vLLM, and SGLang.
- Proficiency in Python programming, software design, debugging, performance analysis, test design, and documentation.
- Track record of working effectively across multiple engineering initiatives and improving AI libraries with new innovations.
- Strong understanding of AI/Deep Learning fundamentals and their practical applications.
 
Ways to Stand Out (Preferred / Nice to Have)
- Hands-on experience in large-scale AI training and a deep understanding of compute system concepts (latency/throughput bottlenecks, pipelining, multiprocessing), with demonstrated experience in performance analysis and tuning.
- Expertise in distributed computing and model parallelism.
- Expertise in mixed precision training.
- Prior experience with Generative AI techniques applied to LLMs and multimodal learning (text, image, video).
- Knowledge of GPU/CPU architecture and numerical software.
- Experience creating or contributing to open-source deep learning frameworks.
 
Compensation & Benefits
- Base salary ranges (determined by location, experience, and the pay of employees in similar positions):
  - Level 4: 184,000 USD - 287,500 USD
  - Level 5: 224,000 USD - 356,500 USD
- You will also be eligible for equity and benefits (see NVIDIA benefits).
 
Other Details
- Location: Santa Clara, CA, United States.
- Employment type: Full time.
- Application window: Applications accepted at least until October 10, 2025.
- #LI-Hybrid
 
NVIDIA is an equal opportunity employer committed to fostering a diverse work environment and does not discriminate on the basis of protected characteristics.