Required Skills & Competences
Python @ 4, CI/CD @ 4, Communication @ 7, Parallel Programming @ 7, Performance Optimization @ 4, Jira @ 4, Product Management @ 4, Debugging @ 7, Project Management @ 4, PyTorch @ 4, CUDA @ 4, GPU @ 4
Details
We are looking for software engineers to contribute to the design and development of libraries and tools that simplify and accelerate computing for unstructured sparsity in deep learning (DL) and high-performance computing (HPC). The team develops GPU-accelerated libraries and SDKs used by commercial and academic organizations for LLMs, computer-aided engineering, quantum chemistry, autonomous vehicles, computer vision, data analytics, and scientific simulations.
Responsibilities
- Design and develop a C++-based system to simplify and accelerate computing for unstructured sparsity in DL and HPC on NVIDIA GPUs.
- Enable the system in languages and frameworks commonly used in DL, such as Python and PyTorch.
- Implement domain-specific language (DSL) specifications of sparse storage formats and on-demand code generation for sparse tensor computations.
- Evaluate and improve system performance on real-life applications and workloads.
- Write effective, well-tested production code to improve library quality, performance, and maintainability.
- Work closely with product management and internal/external partners to understand feature and performance requirements and contribute to technical roadmaps.
Requirements
- BS, MS, or PhD in Computer Science, Applied Math, or a related field (or equivalent experience).
- 6+ years of experience developing, debugging, and optimizing high-performance software using C++ and parallel programming techniques, ideally for sparse linear algebra applications.
- Experience with CUDA and parallel programming technologies such as MPI and OpenMP (or equivalents).
- Experience with domain-specific language design and compiler optimizations; experience with sparse compilers (MLIR or TACO) is required.
- Excellent C++, Python, and CUDA programming skills.
- Strong collaboration, communication, and documentation habits; ideally experience working in a globally distributed organization.
Preferred / Ways to stand out
- Strong understanding of sparse computations, particularly sparsity in AI and HPC.
- Good understanding of LLMs, deep learning methods, and frameworks.
- Experience with low-level GPU performance optimization and numerical linear algebra methods (direct and iterative solvers).
- Experience with CI/CD systems and project management tools such as Jira.
Compensation & Benefits
- Base salary ranges (determined by location, experience, and internal pay equity):
  - Level 4: 184,000 USD - 287,500 USD per year
  - Level 5: 224,000 USD - 356,500 USD per year
- Eligible for equity and company benefits (see NVIDIA benefits).
Additional information
- Applications accepted at least until August 22, 2025.
- NVIDIA is an equal opportunity employer committed to fostering a diverse work environment.