AI System Engineer – New College Grad 2025

at NVIDIA
USD 120,000-235,750 per year
JUNIOR
✅ On-site

Required Skills & Competences

Python @ 5, Algorithms @ 3, Machine Learning @ 3, Data Analysis @ 5, LLM @ 3, PyTorch @ 3, CUDA @ 3, GPU @ 3

Details

At NVIDIA, the AI/ML System Performance team works on next-generation inference optimizations to deliver industry-leading performance. The role focuses on investigating and prototyping scalable inference strategies to reduce per-token latency and maximize system throughput using cross-stack optimizations spanning algorithmic innovations, system-level techniques, and hardware-level enhancements. The team collaborates with deep learning research, framework development, compiler and systems engineering, and silicon architecture.

Sample projects referenced include Helix Parallelism and Disaggregated Inference.

Responsibilities

  • Optimize inference deployment by improving the trade-offs between accuracy, throughput, and interactivity at datacenter scale.
  • Develop high-fidelity performance models to prototype algorithmic techniques and hardware optimizations that drive model-hardware co-design for Generative AI.
  • Prioritize and recommend features to guide future software and hardware roadmaps based on performance modeling and analysis.
  • Model end-to-end performance impact of emerging GenAI workflows (e.g., Agentic Pipelines, inference-time compute scaling) to inform datacenter requirements.
  • Keep current with the latest deep learning research and collaborate with DL researchers, hardware architects, and software engineers.

Requirements

  • Currently pursuing, or have recently completed, an MS or PhD (or equivalent experience) in Computer Science, Electrical Engineering, or a related field.
  • Strong background in computer architecture, roofline modeling, queuing theory, and statistical performance analysis techniques.
  • Solid understanding of machine learning fundamentals, model parallelism, and inference serving techniques.
  • Proficiency in Python for simulator design and data analysis. C++ is optional.

Preferred / Ways to Stand Out

  • Experience in system evaluation of AI/ML workloads, or in performance analysis, modeling, and optimization for AI.
  • Comfortable defining metrics, designing experiments, and visualizing large performance datasets to identify resource bottlenecks.
  • Proven track record of working in cross-functional teams spanning algorithms, software, and hardware architecture.
  • Ability to distill complex analyses into clear recommendations for technical and non-technical stakeholders.
  • Experience with GPU computing (CUDA) or deep learning frameworks such as PyTorch, TRT-LLM, vLLM, or SGLang.

Compensation & Benefits

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 120,000 USD - 189,750 USD for Level 2, and 148,000 USD - 235,750 USD for Level 3. You will also be eligible for equity and benefits.

Additional Information

  • Applications for this job will be accepted at least until August 24, 2025.
  • NVIDIA is an equal opportunity employer committed to diversity and inclusion.