Senior Research Scientist, Multi-Modal Language Models

at NVIDIA
USD 192,000-356,500 per year
SENIOR
✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Python @ 6, Algorithms @ 4, Data Structures @ 4, Distributed Systems @ 4, LLM @ 4, PyTorch @ 6, Deep Learning @ 4, AI @ 4, Computer Vision @ 6

Details

We are seeking a Senior Research Scientist focused on multi-modal language models to drive Nemotron multi-modal technology and deliver state-of-the-art open-source multi-modal models. The team emphasizes open models, open weights, and open data, aiming for models that work well in real-world settings and uplift the multi-modal LLM ecosystem.

Responsibilities

  • Add new capabilities to multi-modal models.
  • Improve generalization of existing capabilities by identifying weak points, designing data synthesis solutions, and retraining models.
  • Develop recipes for training models that mix multiple modalities (text, image, video, audio, etc.).
  • Design solutions that improve Pareto efficiency.
  • Collaborate with researchers to translate cutting-edge ideas into production-ready implementations.
  • Explore new paradigms for evaluation.
  • Demonstrate strong engineering practices and contribute to open-source communities.

Requirements

  • PhD in Computer Science, Electrical Engineering, or a related field, or equivalent research experience in LLMs, systems, or related areas.
  • 4+ years of experience in computer vision, especially multi-modal LLMs.
  • Proficiency in Python with hands-on experience in frameworks such as PyTorch.
  • Solid background in computer science fundamentals: algorithms, data structures, parallel/distributed computing, and systems programming.
  • Proven ability to collaborate across research and engineering teams in multifaceted environments.

Ways to stand out

  • Specific multi-modal LLM research experience.
  • Experience developing and scaling large distributed systems for deep learning.
  • Contributions to open-source LLM systems or large-scale AI infrastructure.

Compensation & Benefits

  • Base salary ranges (dependent on location/level/experience):
    • Level 4: 192,000 USD - 304,750 USD
    • Level 5: 224,000 USD - 356,500 USD
  • Eligible for equity and benefits (the posting links to the company benefits page).

Additional information

  • Location: Santa Clara, CA, United States.
  • Employment type: Full time.
  • Applications accepted at least until February 8, 2026.
  • NVIDIA uses AI tools in its recruiting processes.
  • NVIDIA is an equal opportunity employer and values diversity.