Used Tools & Technologies
Not specified
Required Skills & Competences
- Algorithms (4)
- TensorFlow (4)
- Mentoring (1)
- Debugging (4)
- PyTorch (4)
- CUDA (4)
- GPU (4)
Details
NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. Today NVIDIA is focused on AI and building high-performance, production-grade software that powers next-generation AI systems. You will join the Deep Learning Compiler team to develop compiler optimization algorithms and tooling to accelerate inference and training on NVIDIA GPUs at scale.
Responsibilities
- Develop and implement compiler optimization techniques for deep learning network graphs.
- Design novel graph partitioning and tensor sharding techniques for distributed training and inference (see the illustrative sketch after this list).
- Perform performance tuning and analysis for deep learning workloads.
- Implement code generation for NVIDIA GPU backends using open-source compiler technologies such as MLIR, LLVM, and OpenAI Triton.
- Design user-facing features in JAX and related libraries and perform general software engineering work.
- Collaborate closely with deep learning framework teams and GPU hardware architecture teams to accelerate next-generation AI software and hardware co-design.
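As an illustrative aside, the tensor sharding bullet above can be pictured with a minimal JAX sketch; the mesh axis name, array shapes, and function below are placeholder choices for illustration, not details taken from this role.

```python
# Minimal sketch, assuming a multi-device JAX runtime (falls back to a single
# CPU device). The "data" axis name and shapes are illustrative placeholders.
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

devices = np.array(jax.devices())                # all visible devices
mesh = Mesh(devices, axis_names=("data",))       # 1-D device mesh

x = jnp.ones((len(devices) * 4, 1024))           # batch divisible by mesh size
sharding = NamedSharding(mesh, P("data", None))  # shard rows, replicate columns
x_sharded = jax.device_put(x, sharding)          # place each shard on its device

@jax.jit
def scaled_row_sums(x):
    return jnp.sum(x * 2.0, axis=-1)

y = scaled_row_sums(x_sharded)                   # XLA partitions the computation
print(y.sharding)                                # result inherits the row sharding
```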
Requirements
- Bachelor's, Master's, or Ph.D. in Computer Science, Computer Engineering, or a related field, or equivalent experience.
- 4+ years of relevant work or research experience in performance analysis and compiler optimizations.
- Ability to work independently, define project goals and scope, and lead development efforts while adopting clean software engineering and testing practices.
- Excellent C/C++ programming and software design skills, including debugging, performance analysis, and test design.
- Strong foundation in CPU, GPU, or other high-performance hardware accelerator architectures; knowledge of high-performance computing and distributed programming.
- CUDA or OpenCL programming experience is desired but not required.
- Experience with technologies such as XLA, TVM, MLIR, LLVM, OpenAI Triton, deep learning models and algorithms, and deep learning framework design is a strong plus.
- Strong interpersonal skills and ability to work in a dynamic product-oriented team. Mentoring junior engineers/interns is a bonus.
Preferred / Ways to stand out
- Experience working with deep learning frameworks such as JAX, PyTorch, or TensorFlow.
- Extensive experience with CUDA or GPUs in general.
- Experience with open-source compilers such as XLA, LLVM, MLIR, or TVM.
Benefits and Additional Information
- Base salary ranges: 148,000–235,750 USD for Level 3 and 184,000–287,500 USD for Level 4 (the final base salary will be determined based on location, experience, and the pay of employees in similar positions).
- Eligibility for equity and a comprehensive benefits package.
- Applications for this job will be accepted at least until July 29, 2025.
- NVIDIA is an equal opportunity employer committed to fostering a diverse work environment.