Senior Math Libraries Engineer, CPU and GPU Optimization

at NVIDIA
USD 224,000-425,500 per year
SENIOR
✅ On-site

Required Skills & Competences

Python @ 4, CI/CD @ 4, Hiring @ 4, Communication @ 7, Parallel Programming @ 4, Product Management @ 4, API @ 4, PyTorch @ 4, CUDA @ 4, GPU @ 4

Details

NVIDIA is looking for an expert software engineer to help deliver CUDA-X libraries across the NVIDIA CPU and GPU ecosystem. You will join a team that designs, develops, and optimizes math libraries used in HPC and AI applications. The role focuses on modern, flexible APIs and high-performance kernels across CPU and GPU hardware, with attention to hybrid backends and deep integration with high-level language ecosystems (Python, NumPy, JAX, MLIR, etc.).

Responsibilities

  • Design modern, flexible, and easy-to-use APIs and kernels for math libraries and lead design reviews with collaborators.
  • Work closely with internal teams (Engineering, Product Management) and external partners (researchers, customers) to understand use cases and requirements.
  • Deliver timely math library releases to internal and external customers.
  • Continuously survey trends in software systems and become a domain expert.

Requirements

  • PhD or MSc in Computer Science, Applied Math, or a related science or engineering field preferred (or equivalent experience).
  • 12+ years of experience designing and developing software for high-performance computing and/or AI applications.
  • Advanced C++ skills, including modern design paradigms (e.g., template metaprogramming, RAII).
  • Parallel programming experience with CUDA, OpenCL, or vector programming on CPU (AVX, NEON or similar).
  • Experience with ARM, RISC-V and/or x86_64 CPU architectures.
  • Strong collaboration, communication, and documentation habits.

Ways to stand out

  • Strong background in numerical methods (e.g., FFT, numerical linear algebra).
  • Programming skills with Python and modern automation for building and testing (e.g., CMake, CI/CD, sanitizers).
  • Experience with cross-compilation and bringing code to new CPU/GPU/accelerator architectures.
  • Background with CCCL, OpenMP, OpenACC, multi-threading, MPI, PGAS.
  • Experience with scientific and deep learning libraries/frameworks such as PyTorch, JAX, MKL, MAGMA, PETSc, Kokkos.

Compensation and benefits

  • Base salary range:
    • Level 5: 224,000 USD - 356,500 USD
    • Level 6: 272,000 USD - 425,500 USD
  • You will also be eligible for equity and benefits.

Other details

  • Location: Santa Clara, California, United States
  • Employment type: Full time
  • Applications are accepted at least until September 29, 2025.
  • NVIDIA is an equal opportunity employer and values diversity in hiring and promotion practices.