Senior Research Scientist, Multimodal Foundation Models and Robotics
at NVIDIA
USD 184,000-356,500 per year
Used Tools & Technologies
Not specified
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose;
- 3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology;
- 7-9: expert. You can teach others and you know the pitfalls and tricks;
- 10: exceptional knowledge. Comprehensive understanding of and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Python @ 7
Algorithms @ 4
Machine Learning @ 4
TensorFlow @ 7
Hiring @ 4
PyTorch @ 7
CUDA @ 7
Details
We are looking for a Senior Research Scientist focused on multimodal foundation models and robotics to join the Generalist Embodied Agent Research (GEAR) group at NVIDIA. The mission is to build general-purpose embodied agents and humanoid robot foundation models that learn to explore and master complex skills across virtual and physical worlds. The team produces work on multimodal foundation models, large-scale robot learning, game AI, and physical simulation (examples: Eureka, VIMA, Voyager, MineDojo, MimicPlay, Prismer, Project GR00T).
Responsibilities
- Design and implement novel AI algorithms and models for general-purpose humanoid robots and embodied agents.
- Develop large-scale AI training and inference methods for foundation models.
- Optimize and deploy AI models in physical simulation and on robot hardware.
- Collaborate with research and engineering teams across NVIDIA to transfer research into products and services.
Requirements
Education
- Ph.D. in Computer Science, Computer Engineering, Electrical Engineering, or equivalent research experience.
Experience & Technical Skills
- About 5 or more years of relevant work or research experience in multimodal foundation models and/or robotics.
Multimodal Foundation Models:
- Hands-on training experience and publications in topics such as large language models (LLMs), large vision-language models, video generative models and diffusion algorithms, or action-based transformers.
- Strong engineering skills in rapid prototyping and model training frameworks (PyTorch, JAX, TensorFlow, etc.). Python is required; proficiency in C++ and CUDA is a strong plus.
- Experience working with large-scale machine learning/AI systems and compute infrastructure.
Robotics:
- Hands-on training experience and publications in robot learning (reinforcement learning, imitation learning, classical control methods).
- Strong programming skills in Python and C++; familiarity with ROS and ML frameworks like PyTorch.
- Deep understanding of robot kinematics, dynamics, sensors, and control methods (PID, model predictive control, whole-body control).
- Ability to safely operate robot hardware, lab equipment, and tools; experience with robot hardware design and hands-on building.
- Familiarity with physics simulation frameworks such as MuJoCo and Isaac Sim.
Benefits
- Base salary range (varies by level and location):
  - Level 4: 184,000 USD - 299,000 USD
  - Level 5: 224,000 USD - 356,500 USD
- Eligible for equity and additional benefits (see NVIDIA benefits information).
- Applications accepted at least until November 9, 2025.
- NVIDIA is an equal opportunity employer and values diversity in hiring and promotion practices.