Used Tools & Technologies
Not specified
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — expert. You can teach others and know the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Python @ 6
Algorithms @ 6
LLM @ 3
CUDA @ 4
GPU @ 4
Deep Learning @ 4
AI @ 4
Reinforcement Learning @ 4
Performance Analysis @ 4
Details
Join our ADLR – Efficiency team to make deep learning faster and less energy-intensive. The team influences next-generation hardware, works on the Nemotron series to make these state-of-the-art deep learning models the most efficient open-source models, and develops new technologies, software and algorithms to optimize neural networks for training and deployment. Topics include quantization, sparsity, optimizers, reinforcement learning, efficient architectures and pre-training. The team sits within the Nemotron pre-training group and collaborates across the company to make NVIDIA GPUs the most efficient AI platform possible.
Responsibilities
- Research low-bit number representations and pruning, and their effect on neural network inference and training accuracy, including the requirements of existing state-of-the-art networks and the co-design of future architectures and optimizers.
- Develop new algorithms that make deep learning more efficient while retaining accuracy, and open-source or publish these algorithms.
- Run large-scale deep learning experiments to validate ideas and analyze the effects of efficiency improvements.
- Collaborate across the company with teams building hardware, software and deep learning architectures.
Requirements
- PhD in AI, computer science, computer engineering, math or a related field — or equivalent experience.
- 5+ years of relevant industrial research experience.
- Familiarity with state-of-the-art neural network architectures, optimizers and LLM training.
- Experience with modern deep learning training frameworks and/or inference engines.
- Fluency in Python and solid coding/software-engineering practices.
- Proven track record in publications and/or ability to run large-scale experiments.
- Strong interest in neural network efficiency.
Ways to stand out:
- Experience in quantization, pruning, numerics and efficient architectures.
- Background in computer architecture.
- Experience with GPU computing, kernels, CUDA programming and/or performance analysis.
Benefits
- Base salary ranges (location-, experience- and level-dependent):
  - Level 4: 192,000 USD - 304,750 USD per year
  - Level 5: 224,000 USD - 356,500 USD per year
- Eligible for equity and company benefits.
- Applications accepted at least until February 8, 2026.