Senior Deep Learning Algorithm Engineer, Training Framework

at NVIDIA
USD 184,000-356,500 per year
SENIOR
✅ Hybrid

Required Skills & Competences

  • Python @ 6
  • GitHub @ 4
  • Algorithms @ 4
  • Machine Learning @ 4
  • Mathematics @ 6
  • Performance Optimization @ 4
  • Debugging @ 6
  • API @ 4
  • LLM @ 4
  • PyTorch @ 4
  • GPU @ 4

Details

NVIDIA's NeMo Framework team is seeking engineers to design, develop, and optimize diverse real-world workloads for large-scale model training. NeMo is an open-source, scalable, cloud-native framework for researchers and developers working on Large Language Models (LLM) and Multimodal (MM) foundation model pretraining and post-training. The role focuses on expanding NeMo Framework capabilities across distributed training, model parallelism, performance optimization, APIs, toolkits, and libraries.

Responsibilities

  • Develop algorithms for AI/Deep Learning, data analytics, machine learning, or scientific computing.
  • Contribute to and advance the open-source NeMo Framework (https://github.com/nvidia-nemo).
  • Solve large-scale, end-to-end AI training and inference challenges across the full model lifecycle: orchestration, data preprocessing, model training and tuning, and deployment.
  • Work at the intersection of computer architecture, libraries, frameworks, AI applications, and the software stack.
  • Innovate and improve model architectures, distributed training algorithms, and model parallel paradigms.
  • Tune and optimize performance, including model training and fine-tuning with mixed-precision recipes on next-generation NVIDIA GPU architectures.
  • Research, prototype, and develop robust and scalable AI tools and pipelines.

Requirements

  • MS, PhD, or equivalent experience in Computer Science, AI, Applied Mathematics, or a related field, plus 5+ years of industry experience.
  • Experience with AI frameworks such as PyTorch and JAX.
  • Experience with inference and deployment environments such as TensorRT-LLM (TRTLLM), vLLM, and SGLang.
  • Proficiency in Python programming, software design, debugging, performance analysis, test design, and documentation.
  • Track record of working effectively across multiple engineering initiatives and improving AI libraries with new innovations.
  • Strong understanding of AI/Deep Learning fundamentals and their practical applications.

Ways to Stand Out (Preferred / Nice to Have)

  • Hands-on experience in large-scale AI training and deep understanding of compute system concepts (latency/throughput bottlenecks, pipelining, multiprocessing) with demonstrated performance analysis and tuning.
  • Expertise in distributed computing and model parallelism.
  • Expertise in mixed precision training.
  • Prior experience with Generative AI techniques applied to LLMs and multimodal learning (text, image, video).
  • Knowledge of GPU/CPU architecture and numerical software.
  • Experience creating or contributing to open-source deep learning frameworks.

Compensation & Benefits

  • Base salary ranges (determined by location, experience, and comparable employees):
    • Level 4: 184,000 USD - 287,500 USD
    • Level 5: 224,000 USD - 356,500 USD
  • You will also be eligible for equity and benefits (see NVIDIA benefits).

Other Details

  • Location: Santa Clara, CA, United States.
  • Employment type: Full time.
  • Application window: Applications accepted at least until October 10, 2025.

NVIDIA is an equal opportunity employer committed to fostering a diverse work environment and does not discriminate on the basis of protected characteristics.