Deep Learning Software Engineer, TensorRT Performance - New College Grad 2026
at NVIDIA
USD 124,000-241,500 per year
Used Tools & Technologies
GenAI
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level value.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such level.
Software Development @ 3
Python @ 6
TensorFlow @ 3
Performance Optimization @ 3
OSS @ 3
LLM @ 3
PyTorch @ 3
CUDA @ 5
GPU @ 3
Deep Learning @ 3
Generative AI @ 3
AI @ 3
Robotics @ 3
vLLM @ 3
TensorRT @ 3
SGLang @ 3
Performance Analysis @ 3
JAX @ 3
Details
We are now looking for a Deep Learning Software Engineer, TensorRT Performance. NVIDIA is seeking an experienced Deep Learning Engineer passionate about analyzing and improving the performance of NVIDIA’s inference ecosystem. The team develops GPU-accelerated deep learning inference software such as TensorRT, DL benchmarking software and performant solutions to deploy and serve models across datacenter GPUs and edge SoCs.
Responsibilities
- Establish performance benchmarking methodologies and analysis workflows, and identify performance issues and opportunities across NVIDIA’s inference ecosystem (e.g., TensorRT, TensorRT-EdgeLLM, Torch-TensorRT).
- Contribute features and code to NVIDIA/OSS inference frameworks including but not limited to TensorRT, TensorRT-EdgeLLM, and Torch-TensorRT.
- Develop model pipelines for NVIDIA’s inference ecosystem focused on optimized performance in areas such as quantization, scheduling, memory management, and distributed inference.
- Collaborate with cross-functional teams across generative AI, automotive, robotics, image understanding, and speech understanding to set directions and develop inference solutions.
- Scale performance of deep learning models across different architectures and types of NVIDIA accelerators.
Requirements
- Bachelor's, Master's, PhD, or equivalent experience in Computer Science, Computer Engineering, EECS, AI, or a relevant field.
- 2 years of relevant software development experience.
- Strong C++ and Python programming and software engineering skills.
- Experience with deep learning frameworks (examples: PyTorch, JAX, TensorFlow, ONNX) and inference libraries (examples: TensorRT, TensorRT-LLM, vLLM, SGLang, FlashInfer).
- Experience with performance analysis and performance optimization.
Ways to stand out
- Strong foundation and architectural knowledge of GPUs.
- Deep understanding of modern deep learning models and workloads (e.g., Transformers, Recommenders, ASR, TTS, Visual Understanding).
- Proficiency in one of the domain-specific languages for deep learning programming (examples: CUDA, TileIR, CuTeDSL, CUTLASS, Triton).
- Prior contributions to major LLM inference frameworks (e.g., vLLM) or experience with graph compilers in deep learning inference (e.g., TorchDynamo, TorchInductor).
- Prior experience optimizing performance for low-latency, resource-constrained systems or embedded AI pipelines (e.g., Jetson systems or other edge AI accelerators).
Compensation & Benefits
- Base salary ranges (determined by location, experience, and pay of employees in similar positions):
- Level 2: 124,000 USD - 195,500 USD
- Level 3: 152,000 USD - 241,500 USD
- You will also be eligible for equity and benefits.
Additional information
- Applications for this job will be accepted at least until April 7, 2026.
- This posting is for an existing vacancy.
- NVIDIA uses AI tools in its recruiting processes.
- NVIDIA is an equal opportunity employer and committed to fostering a diverse work environment.