Used Tools & Technologies
GenAI
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Software Development @ 6
Python @ 1
Performance Optimization @ 4
Debugging @ 4
LLM @ 4
PyTorch @ 4
Agile @ 4
CUDA @ 4
GPU @ 4
Deep Learning @ 4
Generative AI @ 4
AI @ 4
Profiling @ 4
vLLM @ 4
NCCL @ 4
SGLang @ 4
Details
NVIDIA seeks a Senior Software Engineer specializing in Deep Learning Inference. You will help design, build, and optimize GPU-accelerated software that powers advanced AI applications, improving frameworks and model serving pipelines for large language models and generative AI across NVIDIA accelerators.
Responsibilities
- Performance optimization, analysis, and tuning of deep learning models in domains such as LLM, multimodal, and generative AI.
- Scale performance of DL models across different architectures and NVIDIA accelerators (datacenter GPUs to edge SoCs).
- Contribute features and code to inference libraries and frameworks including vLLM, SGLang, FlashInfer and other LLM software solutions.
- Collaborate with cross-functional teams across frameworks, NVIDIA libraries, and inference optimization projects.
- Implement and optimize model serving pipelines using open-source tools and plugins (CUTLASS, OpenAI Triton, NCCL, CUDA kernels).
Requirements
- Master's, PhD, or equivalent experience in Computer Engineering, Computer Science, EECS, AI, or a related field.
- 5+ years of relevant software development experience.
- Excellent C/C++ programming and software design skills. Agile software development experience is helpful.
- Python experience is a plus.
- Prior experience training, deploying, or optimizing inference of deep learning models in production is a plus.
- Background in performance modeling, profiling, debugging, code optimization, or CPU/GPU architecture is a plus.
Ways to stand out
- Contributions to deep learning software projects (PyTorch, vLLM, SGLang).
- Experience with multi-GPU communications (NCCL, NVSHMEM).
- Experience building and shipping products to enterprise customers.
- GPU programming experience (CUDA, OpenAI Triton, CUTLASS).
Compensation
- For Poland: Level 3 base salary range: 221,250 PLN - 383,500 PLN.
- For Poland: Level 4 base salary range: 292,500 PLN - 507,000 PLN.
About the company / Benefits
- NVIDIA offers highly competitive salaries, an extensive benefits package, and a work environment promoting diversity, inclusion, and flexibility. NVIDIA is an equal opportunity employer committed to fostering a supportive workplace.