AI System Engineer - New College Grad 2025
Required Skills & Competences
- Python @ 5
- Algorithms @ 3
- Machine Learning @ 3
- Data Analysis @ 5
- LLM @ 3
- PyTorch @ 3
- CUDA @ 3
- GPU @ 3
Details
At NVIDIA, the AI/ML System Performance team works on next-generation inference optimizations to deliver industry-leading performance. The role focuses on investigating and prototyping scalable inference strategies to reduce per-token latency and maximize system throughput using cross-stack optimizations spanning algorithmic innovations, system-level techniques, and hardware-level enhancements. The team collaborates with deep learning research, framework development, compiler and systems engineering, and silicon architecture.
Sample projects referenced include Helix Parallelism and Disaggregated Inference.
Responsibilities
- Optimize inference deployment by improving the trade-offs between accuracy, throughput, and interactivity at datacenter scale.
- Develop high-fidelity performance models to prototype algorithmic techniques and hardware optimizations that drive model-hardware co-design for Generative AI.
- Prioritize and recommend features to guide future software and hardware roadmaps based on performance modeling and analysis.
- Model end-to-end performance impact of emerging GenAI workflows (e.g., Agentic Pipelines, inference-time compute scaling) to inform datacenter requirements.
- Keep current with the latest deep learning research and collaborate with DL researchers, hardware architects, and software engineers.
Requirements
- Pursuing or recently completed an MS or PhD (or equivalent experience) in Computer Science, Electrical Engineering, or related fields.
- Strong background in computer architecture, roofline modeling, queuing theory, and statistical performance analysis techniques.
- Solid understanding of machine learning fundamentals, model parallelism, and inference serving techniques.
- Proficiency in Python for simulator design and data analysis. C++ is optional.
Preferred / Ways to Stand Out
- Experience in system evaluation of AI/ML workloads or performance analysis, modeling, and optimizations for AI.
- Comfortable defining metrics, designing experiments, and visualizing large performance datasets to identify resource bottlenecks.
- Proven track record of working in cross-functional teams spanning algorithms, software, and hardware architecture.
- Ability to distill complex analyses into clear recommendations for technical and non-technical stakeholders.
- Experience with GPU computing (CUDA) or with deep learning frameworks and inference engines such as PyTorch, TRT-LLM, vLLM, or SGLang.
Compensation & Benefits
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 120,000 USD - 189,750 USD for Level 2, and 148,000 USD - 235,750 USD for Level 3. You will also be eligible for equity and benefits.
Additional Information
- Applications for this job will be accepted at least until August 24, 2025.
- NVIDIA is an equal opportunity employer committed to diversity and inclusion.