Used Tools & Technologies
Not specified
Required Skills & Competences
- Algorithms (4)
- TensorFlow (4)
- Mentoring (1)
- Debugging (4)
- PyTorch (4)
- CUDA (4)
- GPU (4)
Details
NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. Today NVIDIA is focused on AI and building high-performance, production-grade software that powers next-generation AI systems. You will join the Deep Learning Compiler team to develop compiler optimization algorithms and tooling to accelerate inference and training on NVIDIA GPUs at scale.
Responsibilities
- Develop and implement compiler optimization techniques for deep learning network graphs.
- Design novel graph partitioning and tensor sharding techniques for distributed training and inference (see the illustrative sketch after this list).
- Perform performance tuning and analysis for deep learning workloads.
- Implement code generation for NVIDIA GPU backends using open-source compiler technologies such as MLIR, LLVM, and OpenAI Triton.
- Design user-facing features in JAX and related libraries and perform general software engineering work.
- Collaborate closely with deep learning framework teams and GPU hardware architecture teams to accelerate next-generation AI software and hardware co-design.
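As an illustrative aside, the tensor sharding bullet above can be pictured with a minimal JAX sketch; the mesh axis name, array shapes, and function below are placeholder choices for illustration, not details taken from this role.

```python
# Minimal sketch, assuming a multi-device JAX runtime (falls back to a single
# CPU device). The "data" axis name and shapes are illustrative placeholders.
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

devices = np.array(jax.devices())                # all visible devices
mesh = Mesh(devices, axis_names=("data",))       # 1-D device mesh

x = jnp.ones((len(devices) * 4, 1024))           # batch divisible by mesh size
sharding = NamedSharding(mesh, P("data", None))  # shard rows, replicate columns
x_sharded = jax.device_put(x, sharding)          # place each shard on its device

@jax.jit
def scaled_row_sums(x):
    return jnp.sum(x * 2.0, axis=-1)

y = scaled_row_sums(x_sharded)                   # XLA partitions the computation
print(y.sharding)                                # result inherits the row sharding
```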
Requirements
- Bachelor's, Master's, or Ph.D. in Computer Science, Computer Engineering, or a related field, or equivalent experience.
- 4+ years of relevant work or research experience in performance analysis and compiler optimizations.
- Ability to work independently, define project goals and scope, and lead development efforts while adopting clean software engineering and testing practices.
- Excellent C/C++ programming and software design skills, including debugging, performance analysis, and test design.
- Strong foundation in CPU, GPU, or other high-performance hardware accelerator architectures; knowledge of high-performance computing and distributed programming.
- CUDA or OpenCL programming experience is desired but not required.
- Experience with technologies such as XLA, TVM, MLIR, LLVM, OpenAI Triton, deep learning models and algorithms, and deep learning framework design is a strong plus.
- Strong interpersonal skills and ability to work in a dynamic product-oriented team. Mentoring junior engineers/interns is a bonus.
Preferred / Ways to stand out
- Experience working with deep learning frameworks such as JAX, PyTorch, or TensorFlow.
- Extensive experience with CUDA or GPUs in general.
- Experience with open-source compilers such as XLA, LLVM, MLIR, or TVM.
Benefits and Additional Information
- Base salary ranges: 148,000–235,750 USD for Level 3 and 184,000–287,500 USD for Level 4 (the final base salary will be determined based on location, experience, and the pay of employees in similar positions).
- Eligibility for equity and a comprehensive benefits package.
- Applications for this job will be accepted at least until July 29, 2025.
- NVIDIA is an equal opportunity employer committed to fostering a diverse work environment.