Deep Learning Kernel Software Performance Architect - New College Grad 2026

at NVIDIA
USD 124,000-241,500 per year
JUNIOR
✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Python @ 5, Machine Learning @ 3, Parallel Programming @ 3, Debugging @ 3, CUDA @ 3, GPU @ 3

Details

NVIDIA is seeking a Performance Architect for Deep Learning Software to develop processor and system architectures that accelerate machine learning, data analytics, and high-performance computing applications. This role involves validating and analyzing the performance of GPU-accelerated systems and software architectures, debugging deep learning and data analytics software to identify performance bottlenecks, and developing scripts and tools to analyze, visualize, and debug software using analytical models, simulators, and test suites. The role requires collaboration across NVIDIA teams including CUDA and AI Compiler teams, AI/ML training and inference performance teams, and hardware architecture performance teams.

Responsibilities

  • Validate and analyze performance of GPU-accelerated system and software architectures that advance deep learning performance.
  • Debug deep learning and data analytics software to identify root causes of performance bottlenecks.
  • Develop scripts and tools to analyze, visualize, and debug software using analytical models, simulators, and test suites.
  • Collaborate with CUDA and AI Compiler teams to pinpoint and resolve performance issues.
  • Engage AI/ML training and inference performance teams to identify and optimize critical deep learning layers.
  • Collaborate with hardware architecture performance teams to define expectations for emerging deep learning hardware features.

Requirements

  • Master's or PhD in Computer Science, Electrical Engineering, Computer Engineering, or equivalent experience.
  • Proven expertise in software design, including debugging, performance analysis, and test development.
  • Hands-on parallel programming experience (GPU experience is helpful but not strictly required).
  • Strong understanding of computer architecture with practical experience in performance debugging.
  • Ability to identify bottlenecks, optimize resource utilization, and enhance system throughput.
  • Fluency in programming languages such as Python, C, and C++.

Ways to stand out

  • Strong foundation in machine learning and deep learning fundamentals.
  • Background in high-performance, power-efficient design, energy-efficient high-performance computing, and performance analysis and profiling.
  • Experience with GPU computing and parallel programming models such as CUDA.
  • Experience with analytical performance modeling, profiling, and analysis.

Compensation and Other Details

  • Base salary ranges (location and level dependent):
    • Level 2: 124,000 USD - 195,500 USD
    • Level 3: 152,000 USD - 241,500 USD
  • Eligible for equity and benefits (link to NVIDIA benefits provided in original posting).
  • Applications are accepted until at least January 27, 2026.
  • Position type: Full time.

Company and Equal Opportunity

NVIDIA highlights a long history in graphics, accelerated computing, and AI. The company states it values diversity and is an equal opportunity employer. NVIDIA uses AI tools in its recruiting processes.