Senior Software Engineer - Parallel Computing Systems

at NVIDIA
USD 184,000-356,500 per year
SENIOR
✅ On-site

Required Skills & Competences

  • GitHub @ 4
  • Algorithms @ 4
  • Distributed Systems @ 4
  • Communication @ 4
  • Parallel Programming @ 4
  • Performance Optimization @ 4
  • CUDA @ 4
  • GPU @ 4

Details

Do you have expertise in CUDA kernel optimization, C++ systems programming, or compiler infrastructure? Join NVIDIA's nvFuser team to build the next-generation fusion compiler that automatically optimizes deep learning models for workloads scaling to thousands of GPUs! We're looking for engineers who excel at parallel programming and systems-level performance work and want to directly impact the future of AI compilation.

Responsibilities

  • Design algorithms that generate highly optimized code from deep learning programs.
  • Build GPU-aware CPU runtime systems coordinating kernel execution for maximum performance.
  • Collaborate with hardware engineers, framework maintainers, and optimization experts to create advanced compiler infrastructure.
  • Debug performance bottlenecks in thousand-GPU distributed systems.
  • Influence next-generation hardware design by developing innovative AI workload optimization techniques.

Requirements

  • MS or PhD in Computer Science, Computer Engineering, Electrical Engineering, or a related field (or equivalent experience).
  • 4+ years of advanced C++ programming, including large-codebase development, template metaprogramming, and performance-critical code.
  • Strong parallel programming experience with multi-threading, OpenMP, CUDA, MPI, NCCL, NVSHMEM, or other parallel computing technologies.
  • Experience with low-level performance optimization and systematic bottleneck identification beyond basic profiling.
  • Performance analysis skills: identifying bottlenecks at the whole-program level and developing optimization strategies.
  • A collaborative problem-solving approach, adaptability in ambiguous situations, first-principles thinking, and a sense of ownership.
  • Excellent verbal and written communication skills.

Ways to Stand Out

  • Experience with HPC/Scientific Computing: CUDA optimization, GPU programming, numerical and communication libraries (cuBLAS, NCCL), or distributed computing.
  • Compiler engineering background: LLVM, GCC, domain-specific language design, program analysis, or IR transformations and optimization passes.
  • Deep technical foundation in CPU/GPU architectures, numeric libraries, modular software design, or runtime systems.
  • Experience with large software projects and performance profiling, and the ability to learn quickly.
  • Expertise with distributed parallelism techniques, tensor operations, auto-tuning, or performance modeling.

Benefits

  • Base salary range: 184,000 USD - 356,500 USD (determined based on location, experience, and comparable employee pay).
  • Eligibility for equity and benefits. See NVIDIA benefits.
  • NVIDIA is an equal opportunity employer committed to diversity and non-discrimination.