Senior Research Scientist, Multimodal Foundation Models and Robotics

at NVIDIA

📍 Santa Clara, United States

USD 184,000-356,500 per year

SENIOR

✅ On-site

Required Skills & Competences

- Python @ 7
- Algorithms @ 4
- Machine Learning @ 4
- TensorFlow @ 7
- Hiring @ 4
- PyTorch @ 7
- CUDA @ 7

Details

We are looking for a Senior Research Scientist focused on multimodal foundation models and robotics to join the Generalist Embodied Agent Research (GEAR) group at NVIDIA. The mission is to build general-purpose embodied agents and humanoid robot foundation models that learn to explore and master complex skills across virtual and physical worlds. The team produces work on multimodal foundation models, large-scale robot learning, game AI, and physical simulation (examples: Eureka, VIMA, Voyager, MineDojo, MimicPlay, Prismer, Project GR00T).

Responsibilities

Design and implement novel AI algorithms and models for general-purpose humanoid robots and embodied agents.
Develop large-scale AI training and inference methods for foundation models.
Optimize and deploy AI models in physical simulation and on robot hardware.
Collaborate with research and engineering teams across NVIDIA to transfer research into products and services.

Requirements

Education

Ph.D. in Computer Science, Computer Engineering, Electrical Engineering, or equivalent research experience.

Experience & Technical Skills

Roughly 5 or more years of relevant work or research experience in multimodal foundation models and/or robotics.

Multimodal Foundation Models:

Hands-on training experience and publications in topics such as large language models (LLMs), large vision-language models, video generative models and diffusion algorithms, or action-based transformers.
Strong engineering skills in rapid prototyping and model training frameworks (PyTorch, JAX, TensorFlow, etc.). Python is required; proficiency in C++ and CUDA is a strong plus.
Experience working with large-scale machine learning/AI systems and compute infrastructure.

Robotics:

Hands-on training experience and publications in robot learning (reinforcement learning, imitation learning, classical control methods).
Strong programming skills in Python and C++; familiarity with ROS and ML frameworks like PyTorch.
Deep understanding of robot kinematics, dynamics, sensors, and control methods (PID, model predictive control, whole-body control).
Ability to safely operate robot hardware, lab equipment, and tools; experience with robot hardware design and hands-on building.
Familiarity with physics simulation frameworks such as MuJoCo and Isaac Sim.

Benefits

Base salary range (varies by level and location):
- Level 4: 184,000 USD - 299,000 USD
- Level 5: 224,000 USD - 356,500 USD
Eligible for equity and additional benefits (see NVIDIA benefits information).
Applications are accepted until at least November 9, 2025.
NVIDIA is an equal opportunity employer and values diversity in hiring and promotion practices.