Required Skills & Competences
Software Development @ 4, Python @ 3, Machine Learning @ 7, Leadership @ 4, Communication @ 7, Parallel Programming @ 4, Performance Optimization @ 1, Jira @ 4, Product Management @ 4, Technical Leadership @ 4, Project Management @ 4, NLP @ 4, Agile @ 4, CUDA @ 4, GPU @ 4
Details
The NVIDIA Math Libraries team is looking for a senior engineer to join its development efforts in kernel generation for AI and HPC, specifically targeting matrix operations, JIT compilation, and kernel fusion. The team develops GPU-accelerated mathematical libraries used across AI, scientific and engineering simulations, and data analytics in domains such as healthcare, NLP, VR, deep learning, and autonomous vehicles.
Full time. Location: Santa Clara, CA, United States. Applications accepted at least until July 29, 2025.
Responsibilities
- Scope, design, and implement high-quality, high-performance numerical dense linear algebra software on GPUs.
- Own execution of projects involving multiple engineers and sometimes multiple teams.
- Provide technical leadership and feedback to library engineers and occasionally mentor interns.
- Work closely with product management and internal/external customers to understand feature and performance requirements and contribute to technical roadmaps of libraries.
- Find opportunities to improve library performance and reduce code maintenance overhead through re-architecting.
- Explain complex solutions, exercise leadership, and coordinate with multiple teams to achieve goals.
Requirements
- PhD, Master’s, or Bachelor’s degree in Computer Science, Applied Math, or related science or engineering field (or equivalent experience).
- 8+ years of experience designing, developing, testing, maintaining, and performance-optimizing HPC software using C++.
- Strong fundamentals in kernel generation and composable library design for linear algebra.
- Leadership skills in driving software development projects.
- Strong collaboration, communication, and documentation habits.
- Hands-on kernel generation experience; a focus on, or experience with, JIT compilation is desired.
Preferred / Ways to stand out
- Experience with parallel programming, ideally using CUDA, MPI, OpenMP, OpenACC, or pthreads.
- Good understanding of machine learning and deep learning technologies and knowledge of GPU (preferred) or CPU hardware architecture.
- Experience with low-level programming (assembly) for performance optimization and operator fusion is a strong plus.
- Experience with agile software development practices and project management tools such as Jira.
- Familiarity with a scripting language, preferably Python.
Compensation & Benefits
- Base salary range (determined by location, experience, and comparable pay):
  - Level 4: 184,000 USD - 287,500 USD
  - Level 5: 224,000 USD - 356,500 USD
- Eligible for equity and company benefits (see NVIDIA benefits).
Equal Opportunity
NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer. NVIDIA does not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.