Used Tools & Technologies
LLM
Required Skills & Competences
Tag name is followed by "@" symbol and proficiency level value.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such level.
Software Development @ 6
DevOps @ 4
Python @ 4
C @ 4
C++ @ 7
Algorithms @ 4
Data Structures @ 4
Machine Learning @ 7
TensorFlow @ 4
Communication @ 4
PyTorch @ 4
CUDA @ 6
GPU @ 4
Deep Learning @ 4
AI @ 4
Profiling @ 4
OpenCL @ 6
TensorRT @ 4
Details
We are now looking for a Senior Software Engineer for Deep Learning Inference. Help build a state-of-the-art inference framework for accelerating Deep Learning models, especially Large Language Models, on NVIDIA GPUs. The position is within the Deep Learning Inference TensorRT software team.
Responsibilities
- Craft and develop robust inferencing software that can be scaled to multiple platforms for functionality and performance.
- Develop components of TensorRT, NVIDIA's SDK for high-performance deep learning inference.
- Closely follow academic developments in the field of artificial intelligence and update TensorRT with corresponding features.
- Use C++ and Python to build graph parsers, optimizers, and tools for effective deployment of trained deep learning models.
- Collaborate with deep learning experts, GPU architects, and DevOps engineers across diverse teams.
Requirements
- Bachelor's, Master's, PhD or equivalent experience in Computer Science, Computer Engineering, Electrical Engineering or related field.
- 3+ years of software development experience.
- Strong experience with modern C++ standards (C++11/C++14/C++17/C++20, etc.).
- Strong grasp of Machine Learning concepts.
- Experience and knowledge in Computer Architecture, Data Structures, Algorithms.
- Excellent communication skills, and an aptitude for collaboration and teamwork.
Ways to stand out
- Experience developing system software.
- Proficiency in Python and background in GPU kernel programming using CUDA or OpenCL.
- Experience in software performance benchmarking, profiling, and optimizations.
- Background in compiler development.
- Experience working with TensorRT, PyTorch, TensorFlow, ONNX Runtime or other ML frameworks.
Compensation & Benefits
- Base salary ranges (by level):
  - Level 3: 152,000 USD - 241,500 USD
  - Level 4: 184,000 USD - 287,500 USD
- You will also be eligible for equity and benefits (link provided in original posting).
Additional information
- #LI-Hybrid
- Applications for this job will be accepted at least until March 21, 2026.
- This posting is for an existing vacancy.
- NVIDIA uses AI tools in its recruiting processes.
- NVIDIA is an equal opportunity employer and committed to fostering a diverse work environment.