Senior GenAI Framework Engineer

at NVIDIA

📍 Santa Clara, United States

USD 184,000-356,500 per year

SENIOR

✅ Hybrid

Required Skills & Competences

Python @ 6, Algorithms @ 4, Debugging @ 6, API @ 4, LLM @ 4, PyTorch @ 4, GPU @ 4

Details

NVIDIA's NeMo Framework team is seeking engineers to design, develop, and optimize diverse real-world GenAI workloads. NeMo is an open-source, scalable, cloud-native framework for researchers and developers working on pretraining and post-training of large language models (LLMs) and multimodal (MM) foundation models. The framework covers end-to-end model training, including pretraining, reasoning, alignment, customization, evaluation, and deployment, along with tooling to optimize performance and user experience.

Responsibilities

Expand NeMo Framework capabilities to enable users to develop, train, and optimize models.
Design and implement distributed training algorithms and model parallel paradigms.
Define robust APIs and expand toolkits and libraries for coherence and comprehensiveness.
Meticulously analyze and tune performance across the stack (from orchestration and data preprocessing to training, tuning, and deployment).
Contribute to the open-source NeMo Framework.
Innovate and improve model architectures and training algorithms.
Tune and optimize performance, including model training and fine-tuning with mixed-precision recipes on next-generation NVIDIA GPU architectures.
Research, prototype, and develop robust and scalable AI tools and pipelines.
Collaborate with internal partners, users, and the open-source community to analyze, design, and implement optimized solutions.

Requirements

MS or PhD in Computer Science, AI, Applied Math, or a related field (or equivalent experience), plus 5+ years of industry experience.
Experience with AI frameworks (e.g., PyTorch, JAX) and/or inference and deployment environments (e.g., TRTLLM, vLLM, SGLang).
Proficient in Python programming, software design, debugging, performance analysis, test design, and documentation.
Strong understanding of AI/deep-learning fundamentals and practical applications.
Consistent record of working effectively across multiple engineering initiatives and improving AI libraries with new innovations.

Ways to stand out

Hands-on experience in large-scale AI training with deep understanding of core compute system concepts (latency/throughput bottlenecks, pipelining, multiprocessing) and demonstrated excellence in performance analysis and tuning.
Expertise in distributed computing, model parallelism, and mixed precision training.
Prior experience with generative AI techniques applied to LLM and multi-modal learning (text, image, video).
Knowledge of GPU/CPU architecture and related numerical software.
Created or contributed to open-source deep learning frameworks.

Compensation & Benefits

Base salary ranges by level:
- Level 4: 184,000 USD - 287,500 USD
- Level 5: 224,000 USD - 356,500 USD
Eligible for equity and benefits.

Additional Information

Location: Santa Clara, CA, United States.
Applications accepted until August 23, 2025.
NVIDIA is an equal opportunity employer committed to diversity and inclusion.