Senior Accelerated Computing Architect

at NVIDIA

📍 Santa Clara, United States

USD 184,000-356,500 per year

SENIOR

✅ On-site

Used Tools & Technologies

HPC

Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level. The levels mean:

1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose;

3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology;

7-9: expert. You can teach others and you know the pitfalls and tricks;

10: exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.

Python @ 3
Algorithms @ 4
Data Structures @ 4
Machine Learning @ 4
Communication @ 3
Prioritization @ 4
Performance Optimization @ 4
API @ 3
CUDA @ 4
GPU @ 4
AI @ 4
OpenCL @ 4

Details

NVIDIA is developing software and system architectures for accelerated high-performance computing, scientific computing, machine learning, AI, datacenter, and automotive computing. This position offers the opportunity to make a meaningful impact in a fast-moving, technology-focused company.

Responsibilities

Perform in-depth analysis and optimization to ensure the best possible performance on current and/or next-generation NVIDIA GPUs.
Create and optimize core parallel algorithms, data structures, and reference codes to provide the best possible solutions for NVIDIA GPUs.
Understand and analyze the interplay of hardware and software architectures on core algorithms, programming models, and applications.
Actively collaborate with hardware design, software engineering, product, and research teams to guide the direction of accelerated computing.
Dive into accelerated computing applications to facilitate software-hardware co-design.
Write up and present work via white papers, conference publications, official blog posts, patent applications, etc., as appropriate.

Requirements

MS or Ph.D. in Computer Science, Computer Engineering, Electrical Engineering, or equivalent experience.
6+ years of relevant work experience.
Strong mathematical fundamentals, including linear algebra and numerical methods.
A passion for performance optimization.
Hands-on experience with a massively parallel GPU programming model (e.g., CUDA or OpenCL). Familiarity with APIs for multi-node communication, such as MPI or OpenSHMEM/NVSHMEM, is a plus.
Strong knowledge of C and C++ with a solid understanding of software design, programming techniques, and algorithms. Familiarity with threading APIs for multicore CPUs and Unix-style inter-process communication (IPC) APIs is a plus.
Familiarity with Python is a plus.
Good communication and organizational skills, a logical approach to problem solving, and good time-management and task-prioritization skills.

Compensation

Base salary ranges provided in the posting:
- Level 4: 184,000 USD - 287,500 USD
- Level 5: 224,000 USD - 356,500 USD
You will also be eligible for equity and benefits.

Additional information

Applications for this job will be accepted at least until April 18, 2026. This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.
NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer.