AI System Engineer – New College Grad 2025

at NVIDIA

📍 Santa Clara, United States

USD 120,000-235,750 per year

JUNIOR

✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Python @ 5
Algorithms @ 3
Machine Learning @ 3
Data Analysis @ 5
LLM @ 3
PyTorch @ 3
CUDA @ 3
GPU @ 3

Details

At NVIDIA, the AI/ML System Performance team works on next-generation inference optimizations to deliver industry-leading performance. The role focuses on investigating and prototyping scalable inference strategies to reduce per-token latency and maximize system throughput using cross-stack optimizations spanning algorithmic innovations, system-level techniques, and hardware-level enhancements. The team collaborates with deep learning research, framework development, compiler and systems engineering, and silicon architecture.

Sample projects referenced include Helix Parallelism and Disaggregated Inference.

Responsibilities

Optimize inference deployment by improving the trade-offs between accuracy, throughput, and interactivity at datacenter scale.
Develop high-fidelity performance models to prototype algorithmic techniques and hardware optimizations that drive model-hardware co-design for Generative AI.
Prioritize and recommend features to guide future software and hardware roadmaps based on performance modeling and analysis.
Model end-to-end performance impact of emerging GenAI workflows (e.g., Agentic Pipelines, inference-time compute scaling) to inform datacenter requirements.
Keep current with the latest deep learning research and collaborate with DL researchers, hardware architects, and software engineers.

Requirements

Pursuing or recently completed an MS or PhD (or equivalent experience) in Computer Science, Electrical Engineering, or related fields.
Strong background in computer architecture, roofline modeling, queuing theory, and statistical performance analysis techniques.
Solid understanding of machine learning fundamentals, model parallelism, and inference serving techniques.
Proficiency in Python for simulator design and data analysis. C++ is optional.

Preferred / Ways to Stand Out

Experience in system evaluation of AI/ML workloads or performance analysis, modeling, and optimizations for AI.
Comfortable defining metrics, designing experiments, and visualizing large performance datasets to identify resource bottlenecks.
Proven track record of working in cross-functional teams spanning algorithms, software, and hardware architecture.
Ability to distill complex analyses into clear recommendations for technical and non-technical stakeholders.
Experience with GPU computing (CUDA) or deep learning frameworks such as PyTorch, TRT-LLM, vLLM, SGLang.

Compensation & Benefits

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 120,000 USD - 189,750 USD for Level 2, and 148,000 USD - 235,750 USD for Level 3. You will also be eligible for equity and benefits.

Additional Information

Applications for this job will be accepted at least until August 24, 2025.
NVIDIA is an equal opportunity employer committed to diversity and inclusion.