Senior Deep Learning Algorithm Engineer, Training Framework
at NVIDIA
Santa Clara, United States
USD 184,000-356,500 per year
Required Skills & Competences
Python @ 6, GitHub @ 4, Algorithms @ 4, Machine Learning @ 4, Mathematics @ 6, Performance Optimization @ 4, Debugging @ 6, API @ 4, LLM @ 4, PyTorch @ 4, GPU @ 4
Details
NVIDIA's NeMo Framework team is seeking engineers to design, develop, and optimize diverse real-world workloads for large-scale model training. NeMo is an open-source, scalable, cloud-native framework for researchers and developers working on Large Language Models (LLM) and Multimodal (MM) foundation model pretraining and post-training. The role focuses on expanding NeMo Framework capabilities across distributed training, model parallelism, performance optimization, APIs, toolkits, and libraries.
Responsibilities
- Develop algorithms for AI/Deep Learning, data analytics, machine learning, or scientific computing.
- Contribute to and advance the open-source NeMo Framework (https://github.com/nvidia-nemo).
- Solve large-scale, end-to-end AI training and inference challenges across the full model lifecycle: orchestration, data preprocessing, model training and tuning, and deployment.
- Work at the intersection of computer architecture, libraries, frameworks, AI applications, and the software stack.
- Innovate and improve model architectures, distributed training algorithms, and model parallel paradigms.
- Perform performance tuning and optimizations, including model training and fine-tuning with mixed precision recipes on next-generation NVIDIA GPU architectures.
- Research, prototype, and develop robust and scalable AI tools and pipelines.
Requirements
- MS, PhD, or equivalent experience in Computer Science, AI, Applied Mathematics, or a related field, plus 5+ years of industry experience.
- Experience with AI frameworks such as PyTorch and JAX.
- Experience with inference and deployment environments such as TRTLLM, vLLM, or SGLang.
- Proficiency in Python programming, software design, debugging, performance analysis, test design, and documentation.
- Track record of working effectively across multiple engineering initiatives and improving AI libraries with new innovations.
- Strong understanding of AI/Deep Learning fundamentals and their practical applications.
Ways to Stand Out (Preferred / Nice to Have)
- Hands-on experience in large-scale AI training and deep understanding of compute system concepts (latency/throughput bottlenecks, pipelining, multiprocessing) with demonstrated performance analysis and tuning.
- Expertise in distributed computing and model parallelism.
- Expertise in mixed precision training.
- Prior experience with Generative AI techniques applied to LLMs and multimodal learning (text, image, video).
- Knowledge of GPU/CPU architecture and numerical software.
- Experience creating or contributing to open-source deep learning frameworks.
Compensation & Benefits
- Base salary ranges (determined by location, experience, and comparable employees):
  - Level 4: 184,000 USD - 287,500 USD
  - Level 5: 224,000 USD - 356,500 USD
- You will also be eligible for equity and benefits (see NVIDIA benefits).
Other Details
- Location: Santa Clara, CA, United States.
- Employment type: Full time.
- Application window: Applications accepted at least until October 10, 2025.
- #LI-Hybrid
NVIDIA is an equal opportunity employer committed to fostering a diverse work environment and does not discriminate on the basis of protected characteristics.