Senior Math Libraries Engineer, CPU and GPU Optimization

at NVIDIA

📍 Santa Clara, United States

USD 224,000-425,500 per year

SENIOR

✅ On-site

Required Skills & Competences

Python @ 3, CI/CD @ 4, Communication @ 7, Parallel Programming @ 4, Product Management @ 4, API @ 4, PyTorch @ 4, CUDA @ 4, GPU @ 4

Details

NVIDIA is looking for an expert software engineer to help deliver CUDA-X libraries across the NVIDIA CPU and GPU ecosystem. You will join a team designing, developing, and optimizing math libraries used across HPC and AI applications, with a focus on performance, portability, and modern APIs.

Responsibilities

Design modern, flexible, and easy-to-use APIs and kernels for math libraries and lead design reviews with collaborators.
Work closely with internal teams (Engineering, Product Management) and external partners (researchers, customers) to understand use cases and requirements.
Deliver timely math library releases to internal and external customers.
Continuously survey trends in software systems to become a domain expert and guide the direction of the libraries.

Requirements

PhD or MSc degree in Computer Science, Applied Math, or a related science or engineering field preferred (or equivalent experience).
12+ years of experience designing and developing software for high-performance computing and/or AI applications.
Advanced C++ skills, including modern design paradigms (e.g., template metaprogramming, RAII).
Parallel programming experience with CUDA, OpenCL, or CPU vector programming (AVX, NEON, or similar).
Experience with CPU architectures such as ARM, RISC-V and/or x86_64.
Strong collaboration, communication, and documentation habits.

Preferred / Ways to Stand Out

Strong background in numerical methods (e.g., FFT, numerical linear algebra).
Programming skills with Python and familiarity with high-level ecosystems (e.g., NumPy, JAX, MLIR).
Experience with modern build and test automation (e.g., CMake, CI/CD, sanitizers).
Experience setting up cross-compilation toolchains for CPUs, GPUs, and other accelerators.
Background in CCCL, OpenMP, OpenACC, multi-threading, MPI, or PGAS.
Experience with scientific and deep learning libraries and frameworks such as PyTorch, JAX, MKL, MAGMA, PETSc, Kokkos.

Compensation and Benefits

Base salary range (determined by location, experience, and the pay of employees in similar positions):
- Level 5: 224,000 USD - 356,500 USD
- Level 6: 272,000 USD - 425,500 USD
You will also be eligible for equity and benefits.

Other Information

Applications for this job will be accepted at least until September 29, 2025.
NVIDIA is an equal opportunity employer and is committed to fostering a diverse work environment.