Senior System Software Engineer, Performance - CUDA Driver

at NVIDIA

📍 Santa Clara, United States

USD 184,000-356,500 per year

SENIOR

✅ Hybrid

Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level. About proficiency levels:

1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose.

3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology.

7-9: expert. You can teach others and you know all the pitfalls and tricks.

10: exceptional knowledge. Comprehensive understanding of and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.

Linux @ 4, Hiring @ 4, Communication @ 4, Debugging @ 4, API @ 4, macOS @ 4, CUDA @ 4, GPU @ 4

Details

We are hiring senior engineers to work on the CUDA driver and runtime, core components of a platform for accelerating general-purpose computation on the GPU. The team analyzes application performance, investigates bottlenecks in software or hardware, and delivers features and improvements to better realize the potential of NVIDIA hardware for workloads such as deep learning, scientific computation, self-driving cars, video games, and virtual reality.

CUDA defines a unified programming model across a range of system configurations and hardware capabilities. The CUDA driver interacts with GPU hardware, kernel mode drivers, and the operating system.

Responsibilities

Evangelize, architect, and implement new features for the CUDA driver and runtime.
Oversee and drive development efforts across multiple teams.
Analyze full-stack performance ranging from application level through libraries, system software, kernel software, and hardware.
Define forward-looking improvements to the CUDA APIs and programming model.
Create novel system software optimizations.
Write effective, maintainable, and well-tested code.
Develop code for multiple operating systems (Windows, Linux, macOS).
Investigate complex performance problems and deliver robust solutions that accelerate applications.

Requirements

BS or MS degree in Computer Science, Electrical Engineering, or equivalent experience.
7+ years of related development experience.
Strong C programming skills.
Experience working with large codebases.
Track record of debugging performance problems in complex environments with software and hardware components.
Experience with operating system interfaces for threads, process control, and virtual memory.
Experience writing and debugging multithreaded programs.
Strong collaborative and interpersonal skills; proven ability to guide and influence within a dynamic matrix environment. Good written communication skills.

Nice to have / Ways to stand out

Understanding of system-level architecture such as interconnects, memory hierarchy, interrupts, and memory-mapped I/O.
Experience with performance tuning of device drivers or low-level system software.
Experience with performance optimizations across a variety of CPU architectures (x86, POWER, ARM).
Knowledge of memory coherence and consistency models.
Experience with Windows, Linux, or macOS driver development.

Compensation & Benefits

Base salary ranges (depending on level and location):
- Level 4: 184,000 USD - 287,500 USD
- Level 5: 224,000 USD - 356,500 USD
You will also be eligible for equity and benefits. Exact base salary will be determined based on location, experience, and pay of employees in similar positions.

Additional information

Location: Santa Clara, CA, United States (hybrid)
Applications accepted at least until December 25, 2025.
NVIDIA is an equal opportunity employer and committed to fostering a diverse work environment.