Applied AI Research Engineering Intern - Fall 2025

at NVIDIA

📍 Santa Clara, United States

USD 18-71 per hour

INTERN

✅ On-site

Required Skills & Competences

Software Development, Kubernetes, Python, GitHub, Algorithms, Data Structures, Machine Learning, Communication, Rust, Debugging, API, NLP, LLM, GPU (each tagged @ 3)

Details

NVIDIA Dynamo is an innovative, open-source platform focused on efficient, scalable inference for large language and reasoning models in distributed GPU environments. By leveraging sophisticated techniques in serving architecture, GPU resource management, and intelligent request handling, Dynamo delivers high-performance AI inference for demanding applications. The team addresses challenging issues in distributed AI infrastructure, building the next generation of scalable AI systems.

Responsibilities

Collaborate on the design and development of the Dynamo Kubernetes stack.
Introduce new features to the Dynamo Python SDK and Dynamo Rust Runtime Core Library.
Design, implement, and optimize distributed inference components in Rust and Python.
Contribute to the development of disaggregated serving for Dynamo-supported inference engines, including vLLM, SGLang, TensorRT-LLM, llama.cpp, and mistral.rs.
Improve intelligent routing and key-value cache management subsystems.
Contribute to open-source repositories, participate in code reviews, and assist with issue triage on GitHub.
Work closely with the community to address issues, capture feedback, and evolve the framework’s APIs and architecture.

Requirements

Pursuing a Bachelor's or Master's degree in Computer Science or a related field.
Excellent programming and software design skills in Golang, Rust, and/or Python, including debugging, performance and service health analysis, and test design.
Good understanding of algorithms and data structures.
Solid knowledge of RESTful APIs.
Highly motivated, dedicated, and curious about new technologies, with excellent communication, planning, and problem-solving skills.

Ways To Stand Out From The Crowd

Understanding of machine learning or NLP concepts.
Experience with the software shipping cycle (development, deployment, release, CI) and with open-source software development.
Experience with inference engines such as vLLM, SGLang, or TensorRT-LLM.
Experience building and deploying containers in Kubernetes environments.

Benefits

Internship hourly rate ranging from 18 USD to 71 USD, based on position, location, year in school, degree, and experience.
Eligibility for NVIDIA intern benefits.
Commitment to diversity and equal opportunity employment.