Senior Math Libraries Engineer, CPU and GPU Optimization
at NVIDIA
USD 224,000-425,500 per year
Required Skills & Competences
Python @ 3, CI/CD @ 4, Communication @ 7, Parallel Programming @ 4, Product Management @ 4, API @ 4, PyTorch @ 4, CUDA @ 4, GPU @ 4
Details
NVIDIA is looking for an expert software engineer to help deliver CUDA-X libraries across the NVIDIA CPU and GPU ecosystem. You will join a team designing, developing, and optimizing math libraries used across HPC and AI applications, with a focus on performance, portability, and modern APIs.
Responsibilities
- Design modern, flexible, and easy-to-use APIs and kernels for math libraries and lead design reviews with collaborators.
- Work closely with internal teams (Engineering, Product Management) and external partners (researchers, customers) to understand use cases and requirements.
- Deliver timely math library releases to internal and external customers.
- Continuously survey trends in software systems to become a domain expert and guide the direction of the libraries.
Requirements
- PhD or MSc degree in Computer Science, Applied Math, or a related science or engineering field preferred (or equivalent experience).
- 12+ years of experience designing and developing software for high-performance computing and/or AI applications.
- Advanced C++ skills, including modern design paradigms (e.g., template metaprogramming, RAII).
- Parallel programming experience with CUDA, OpenCL, or vector programming on CPU (AVX, NEON or similar).
- Experience with CPU architectures such as ARM, RISC-V and/or x86_64.
- Strong collaboration, communication, and documentation habits.
Preferred / Ways to Stand Out
- Strong background in numerical methods (e.g., FFT, numerical linear algebra).
- Programming skills with Python and familiarity with high-level ecosystems (e.g., NumPy, JAX, MLIR).
- Modern automation for building and testing (e.g., CMake, CI/CD, sanitizers).
- Experience with cross-compilation and setting up CPU/GPU/accelerator cross-compilation toolchains.
- Background with CCCL, OpenMP, OpenACC, multi-threading, MPI, PGAS.
- Experience with scientific and deep learning libraries and frameworks such as PyTorch, JAX, MKL, MAGMA, PETSc, Kokkos.
Compensation and Benefits
- Base salary range (determined by location, experience, and internal pay):
  - Level 5: 224,000 USD - 356,500 USD
  - Level 6: 272,000 USD - 425,500 USD
- You will also be eligible for equity and benefits.
Other Information
- Applications for this job will be accepted at least until September 29, 2025.
- NVIDIA is an equal opportunity employer and is committed to fostering a diverse work environment.