Senior Systems Software Engineer, CUDA Driver - Multi-Node and Memory Model
Used Tools & Technologies
Not specified
Required Skills & Competences
Tag name is followed by "@" symbol and proficiency level value.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such level.
Software Development @ 4
Linux @ 4
Hiring @ 4
Communication @ 4
Debugging @ 4
API @ 4
PyTorch @ 4
CUDA @ 4
GPU @ 4
Deep Learning @ 7
AI @ 7
Details
NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. NVIDIA is growing its teams and is hiring systems software engineers to work on core platform components for accelerating general-purpose computation on the GPU.
Responsibilities
- Evangelize, architect, and implement new features related to CUDA’s memory model and multi-node scalability geared towards next-gen AI applications and deployments.
- Coordinate and drive development efforts across multiple teams.
- Help define forward-looking improvements to the CUDA APIs and programming model.
- Write effective, maintainable, and well-tested code.
- Develop code for multiple operating systems.
Requirements
- BS or MS degree in Computer Science, Electrical Engineering, or a related field (or equivalent experience).
- Strong C and C++ programming skills.
- Minimum of 8 years of related development experience (multiple positions are open for varying experience levels).
- Experience driving projects across multiple teams.
- Experience working with large codebases.
- Background with operating system interfaces for threads, process control, and virtual memory.
- Experience writing and debugging multithreaded programs.
- Good written communication and presentation skills.
Preferred / Ways to stand out
- Prior experience with parallel computing, PyTorch, or low-latency AI inference.
- Understanding of system-level architecture, such as interconnects, memory hierarchy, interrupts, and memory-mapped I/O.
- Knowledge of memory coherence and consistency models.
- Background with kernel mode development.
- Experience with Linux or Windows systems software development.
Compensation & Benefits
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5. You will also be eligible for equity and benefits.
Additional details
- Applications for this job will be accepted at least until April 18, 2026.
- This posting is for an existing vacancy.
- NVIDIA uses AI tools in its recruiting processes.
- NVIDIA is an equal opportunity employer and is committed to fostering a diverse work environment.