Senior GenAI Framework Engineer
at NVIDIA
Santa Clara, United States
USD 184,000-356,500 per year
Required Skills & Competences
Python @ 6, Algorithms @ 4, Debugging @ 6, API @ 4, LLM @ 4, PyTorch @ 4, GPU @ 4
Details
NVIDIA's NeMo Framework team is seeking engineers to design, develop, and optimize diverse real-world GenAI workloads. NeMo is an open-source, scalable, cloud-native framework for researchers and developers working on pretraining and post-training of Large Language Model (LLM) and Multimodal (MM) foundation models. The framework supports the end-to-end model workflow, including pretraining, reasoning, alignment, customization, evaluation, and deployment, along with tooling to optimize performance and user experience.
Responsibilities
- Expand NeMo Framework capabilities to enable users to develop, train, and optimize models.
- Design and implement distributed training algorithms and model parallel paradigms.
- Define robust APIs and expand toolkits and libraries for coherence and comprehensiveness.
- Meticulously analyze and tune performance across the stack (from orchestration and data preprocessing to training, tuning, and deployment).
- Contribute to the open-source NeMo Framework.
- Innovate and improve model architectures and training algorithms.
- Tune and optimize performance, including model training and fine-tuning with mixed-precision recipes on next-generation NVIDIA GPU architectures.
- Research, prototype, and develop robust and scalable AI tools and pipelines.
- Collaborate with internal partners, users, and the open-source community to analyze, design, and implement optimized solutions.
Requirements
- MS, PhD or equivalent experience in Computer Science, AI, Applied Math, or a related field and 5+ years of industry experience.
- Experience with AI frameworks (e.g., PyTorch, JAX) and/or inference and deployment environments (e.g., TRTLLM, vLLM, SGLang).
- Proficient in Python programming, software design, debugging, performance analysis, test design, and documentation.
- Strong understanding of AI/deep-learning fundamentals and practical applications.
- Consistent record of working effectively across multiple engineering initiatives and improving AI libraries with new innovations.
Ways to stand out
- Hands-on experience in large-scale AI training with deep understanding of core compute system concepts (latency/throughput bottlenecks, pipelining, multiprocessing) and demonstrated excellence in performance analysis and tuning.
- Expertise in distributed computing, model parallelism, and mixed precision training.
- Prior experience with generative AI techniques applied to LLM and multi-modal learning (text, image, video).
- Knowledge of GPU/CPU architecture and related numerical software.
- Experience creating or contributing to open-source deep learning frameworks.
Compensation & Benefits
- Base salary ranges by level:
  - Level 4: 184,000 USD - 287,500 USD
  - Level 5: 224,000 USD - 356,500 USD
- Eligible for equity and benefits.
Additional Information
- Location: Santa Clara, CA, United States. #LI-Hybrid
- Applications accepted until August 23, 2025.
- NVIDIA is an equal opportunity employer committed to diversity and inclusion.