Systems Performance Engineer, Agentic AI Workloads – New College Grad 2026

at NVIDIA

📍 Santa Clara, United States

USD 124,000-241,500 per year

JUNIOR / MIDDLE

✅ On-site

Used Tools & Technologies

Machine Learning

Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level. The levels mean:

1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose;

3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology;

7-9: expert. You can teach others and you know the pitfalls and tricks;

10: exceptional knowledge. Comprehensive understanding of and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.

Python @ 3, Statistics @ 6, Mathematics @ 3, LLM @ 3, Deep Learning @ 3, AI @ 3, Data Pipelines @ 3, Performance Analysis @ 6

Details

NVIDIA is looking for a Deep Learning Architect to join our team working at the cutting edge of AI infrastructure. As agentic LLM workloads reshape the demands placed on modern datacenters, we need engineers who can model, simulate, and reason about complex system-level traffic at scale. If you have a passion for performance analysis, a strong quantitative foundation, and excitement about the future of AI systems, we'd love to talk.

Responsibilities

Develop and extend C++ and Python simulators that model system-level network and compute traffic for agentic LLM workloads in datacenter environments
Characterize real-world LLM serving workloads and distill them into representative simulator inputs
Run simulations at scale and apply statistical techniques to post-process and interpret results
Identify performance bottlenecks and translate findings into concrete architectural recommendations
Collaborate with hardware, software, and research teams to influence the design of future AI systems

Requirements

Pursuing or recently completed an MS or PhD in Computer Science, Electrical Engineering, Mathematics, or a related field (or equivalent experience)
Strong programming skills in C++ and Python
Solid foundations in queueing theory and traffic modeling (e.g., Erlang models, Little's Law)
Strong statistics background, including the ability to characterize very large datasets using percentiles, distributions, and clustering techniques such as K-means
Understanding of deep learning fundamentals, large language models (LLMs), and modern inference serving frameworks

Ways to stand out

Hands-on experience with traffic or network simulators, even in an academic or course project context
Familiarity with roofline modeling and performance scaling of deep learning models at the kernel level
Experience running large-scale simulation campaigns and building data pipelines to process and visualize results
Prior work characterizing or benchmarking ML inference workloads

Compensation & Benefits

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 124,000 USD - 195,500 USD for Level 2, and 152,000 USD - 241,500 USD for Level 3. You will also be eligible for equity and benefits.

Additional information

Applications for this job will be accepted at least until June 7, 2026.
This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.
NVIDIA is an equal opportunity employer and is committed to fostering an inclusive work environment.