Senior Applied Deep Learning Research Scientist, Efficiency

at NVIDIA

📍 Santa Clara, United States

USD 192,000-356,500 per year

SENIOR

✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level. About proficiency levels:

1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose;

3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology;

7-9: expert. You can teach others and you know the pitfalls and tricks;

10: exceptional knowledge. Comprehensive understanding of and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.

Python @ 6, Algorithms @ 6, LLM @ 3, CUDA @ 4, GPU @ 4, Deep Learning @ 4, AI @ 4, Reinforcement Learning @ 4, Performance Analysis @ 4

Details

Join our ADLR – Efficiency team to make deep learning faster and more energy-efficient. The team influences next-generation hardware, works on the Nemotron series to make these state-of-the-art models the most efficient open-source models, and develops new technologies, software and algorithms to optimize neural networks for training and deployment. Topics include quantization, sparsity, optimizers, reinforcement learning, efficient architectures and pre-training. The team sits within the Nemotron pre-training group and collaborates across the company to make NVIDIA GPUs the most efficient AI platform possible.

Responsibilities

Research low-bit number representations and pruning, and their effects on neural network inference and training accuracy, including the requirements of existing state-of-the-art networks and the co-design of future architectures and optimizers.
Develop new algorithms that make deep learning more efficient while retaining accuracy, and open-source or publish these algorithms.
Run large-scale deep learning experiments to validate ideas and analyze the effects of efficiency improvements.
Collaborate across the company with teams building hardware, software and deep learning architectures.

Requirements

PhD in AI, computer science, computer engineering, math or a related field — or equivalent experience.
5+ years of relevant industrial research experience.
Familiarity with state-of-the-art neural network architectures, optimizers and LLM training.
Experience with modern deep learning training frameworks and/or inference engines.
Fluency in Python and solid coding/software-engineering practices.
Proven track record of publications and/or demonstrated ability to run large-scale experiments.
Strong interest in neural network efficiency.

Ways to stand out:

Experience in quantization, pruning, numerics and efficient architectures.
Background in computer architecture.
Experience with GPU computing, kernels, CUDA programming and/or performance analysis.

Benefits

Base salary ranges (location-, experience- and level-dependent):
- Level 4: 192,000 USD - 304,750 USD per year
- Level 5: 224,000 USD - 356,500 USD per year
Eligible for equity and company benefits.
Applications accepted at least until February 8, 2026.