Senior Deep Learning Architect, LLM Inference

at NVIDIA
USD 184,000-356,500 per year
SENIOR
On-site

Required Skills & Competences

Marketing @ 4, Python @ 3, Leadership @ 4, Communication @ 4, LLM @ 4, PyTorch @ 4, CUDA @ 3, GPU @ 4

Details

NVIDIA is at the forefront of the generative AI revolution. The Inference Benchmarking (IB) team focuses on advanced inference server performance for Large Language Models (LLMs).

Responsibilities

  • Characterize the latest LLMs and inference servers such as vLLM and SGLang to ensure TensorRT-LLM (TRT-LLM) maintains performance leadership.
  • Collaborate with the performance marketing team to build content such as blog posts highlighting TRT-LLM's achievements.
  • Work with engineers from AI startups to debug issues and standardize benchmarking methodologies.
  • Profile GPU kernel-level performance to identify hardware and software optimization opportunities.
  • Develop profiling and analysis software tools that adapt to rapid network scaling.
  • Contribute to deep learning software projects including PyTorch, TRT-LLM, vLLM, and SGLang.
  • Verify performance of TRT-LLM for new GPU product launches.
  • Collaborate across the company with software, research, and product teams to guide the direction of inference serving.

Requirements

  • Master's or PhD degree in Computer Science, Computer Engineering, or related fields, or equivalent experience.
  • 6+ years of relevant industry experience.
  • Detailed knowledge of deep learning inference serving, PyTorch programming, profiling, and compiler optimizations.
  • Proficiency in Python and C++ and familiarity with CUDA.
  • Experience with LLMs and their performance challenges and opportunities.
  • Solid understanding of CPU and GPU microarchitecture and performance.
  • Experience with complex software projects such as frameworks, compilers, or operating systems.
  • Good written and verbal communication skills, and the ability to work both independently and collaboratively.

Ways to stand out

  • Demonstrate continuous improvement in software and hardware performance.
  • Showcase novel use cases for agentic AI tools.
  • Experience with databases and visualization tools such as D3.js.

Benefits

  • Eligibility for equity and other benefits.
  • Work in a highly motivated, forward-thinking, and skilled team at NVIDIA.

Salary range: $184,000 - $356,500 USD annually. Salary is based on location, experience, and peer pay.

NVIDIA is an equal opportunity employer committed to diversity and inclusion.