Senior Math Libraries Engineer, CPU and GPU Optimization
at NVIDIA
USD 224,000-425,500 per year
Used Tools & Technologies
Not specified
Required Skills & Competences
Python @ 4, CI/CD @ 4, Hiring @ 4, Communication @ 7, Parallel Programming @ 4, Product Management @ 4, API @ 4, PyTorch @ 4, CUDA @ 4, GPU @ 4
Details
NVIDIA is looking for an expert software engineer to help deliver CUDA-X libraries across the NVIDIA CPU and GPU ecosystem. You will join a team that designs, develops, and optimizes math libraries used in HPC and AI applications. The role focuses on modern, flexible APIs and high-performance kernels across CPU and GPU hardware, with attention to hybrid backends and deep integration with high-level language ecosystems (Python, NumPy, JAX, MLIR, etc.).
Responsibilities
- Design modern, flexible, and easy-to-use APIs and kernels for math libraries and lead design reviews with collaborators.
- Work closely with internal teams (Engineering, Product Management) and external partners (researchers, customers) to understand use cases and requirements.
- Deliver timely math library releases to internal and external customers.
- Continuously survey trends in software systems and become a domain expert.
Requirements
- PhD or MSc in Computer Science, Applied Math, or a related science or engineering field preferred (or equivalent experience).
- 12+ years of experience designing and developing software for high-performance computing and/or AI applications.
- Advanced C++ skills, including modern design paradigms (e.g., template metaprogramming, RAII).
- Parallel programming experience with CUDA, OpenCL, or vector programming on CPU (AVX, NEON or similar).
- Experience with ARM, RISC-V and/or x86_64 CPU architectures.
- Strong collaboration, communication, and documentation habits.
Ways to stand out
- Strong background in numerical methods (e.g., FFT, numerical linear algebra).
- Programming skills with Python and modern automation for building and testing (e.g., CMake, CI/CD, sanitizers).
- Experience with cross-compilation and bringing code to new CPU/GPU/accelerator architectures.
- Background with CCCL, OpenMP, OpenACC, multi-threading, MPI, PGAS.
- Experience with scientific and deep learning libraries/frameworks such as PyTorch, JAX, MKL, MAGMA, PETSc, Kokkos.
Compensation and benefits
- Base salary range:
  - Level 5: 224,000 USD - 356,500 USD
  - Level 6: 272,000 USD - 425,500 USD
- You will also be eligible for equity and benefits.
Other details
- Location: Santa Clara, California, United States
- Employment type: Full time
- Applications accepted at least until September 29, 2025.
- NVIDIA is an equal opportunity employer and values diversity in hiring and promotion practices.