Deep Learning Kernel Software Performance Architect - New College Grad 2026
Used Tools & Technologies
Not specified
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such level.
Python @ 5
Machine Learning @ 3
Parallel Programming @ 3
Debugging @ 3
CUDA @ 3
GPU @ 3
Details
NVIDIA is seeking a Performance Architect for Deep Learning Software to develop processor and system architectures that accelerate machine learning, data analytics, and high-performance computing applications. This role involves validating and analyzing the performance of GPU-accelerated systems and software architectures, debugging deep learning and data analytics software to identify performance bottlenecks, and developing scripts and tools to analyze, visualize, and debug software using analytical models, simulators, and test suites. The role requires collaboration across NVIDIA teams including CUDA and AI Compiler teams, AI/ML training and inference performance teams, and hardware architecture performance teams.
Responsibilities
- Validate and analyze performance of GPU-accelerated system and software architectures that advance deep learning performance.
- Debug deep learning and data analytics software to identify root causes of performance bottlenecks.
- Develop scripts and tools to analyze, visualize, and debug software using analytical models, simulators, and test suites.
- Collaborate with CUDA and AI Compiler teams to pinpoint and resolve performance issues.
- Engage AI/ML training and inference performance teams to identify and optimize critical deep learning layers.
- Collaborate with hardware architecture performance teams to define expectations for emerging deep learning hardware features.
Requirements
- Master's or PhD in Computer Science, Electrical Engineering, Computer Engineering, or equivalent experience.
- Proven expertise in software design, including debugging, performance analysis, and test development.
- Hands-on experience with practical parallel programming (GPU experience helpful but not strictly required).
- Strong understanding of computer architecture with practical experience in performance debugging.
- Ability to identify bottlenecks, optimize resource utilization, and enhance system throughput.
- Fluency in programming languages such as Python, C, and C++.
Ways to stand out
- Strong foundation in machine learning and deep learning fundamentals.
- Background in high-performance, power-efficient designs; energy-efficient high-performance computing; performance analysis and profiling.
- Experience with GPU computing and parallel programming models; familiarity with CUDA is a plus.
- Experience with analytical performance modeling, profiling, and analysis.
Compensation and Other Details
- Base salary ranges (location and level dependent):
  - Level 2: 124,000 USD - 195,500 USD
  - Level 3: 152,000 USD - 241,500 USD
- Eligible for equity and benefits (link to NVIDIA benefits provided in original posting).
- Applications accepted at least until January 27, 2026.
- Position type: Full time.
Company and Equal Opportunity
NVIDIA highlights a long history in graphics, accelerated computing, and AI. The company states it values diversity and is an equal opportunity employer. NVIDIA uses AI tools in its recruiting processes.