Senior Deep Learning Software Engineer, Inference

at NVIDIA

📍 Santa Clara, United States

$148,000-276,000 per year

SENIOR
✅ Hybrid

Used Tools & Technologies

Not specified

Required Skills & Competences

Software Development @ 7, Python @ 1, Performance Optimization @ 4, LLM @ 4, Agile @ 1

Details

We are now looking for a Senior Deep Learning Software Engineer, Inference! NVIDIA is seeking an experienced Deep Learning Engineer focused on analyzing and improving the performance of DL inference. NVIDIA is rapidly growing its research and development for Deep Learning Inference and is seeking excellent Software Engineers at all levels of expertise to join the team. Companies around the world are using NVIDIA GPUs to power a revolution in deep learning, enabling breakthroughs in areas like LLMs, Generative AI, Recommenders, and Vision that have put DL into every software solution. Join the team that builds the software to enable the performance optimization, deployment, and serving of these DL solutions. We specialize in developing GPU-accelerated deep learning software like TensorRT, DL benchmarking software, and performant solutions to deploy and serve these models.

Responsibilities

  • Optimize, analyze, and tune the performance of DL models in domains such as LLMs, Recommenders, GNNs, and Generative AI.
  • Scale performance of DL models across different architectures and types of NVIDIA accelerators.
  • Contribute features and code to NVIDIA’s inference benchmarking frameworks, TensorRT, Triton, and LLM software solutions.
  • Collaborate with teams across generative AI, automotive, image understanding, and speech understanding to develop innovative solutions.

Requirements

  • Master's or PhD, or equivalent experience, in a relevant field (Computer Engineering, Computer Science, EECS, AI).
  • At least 5 years of relevant software development experience.
  • Excellent C/C++ programming and software design skills. Agile software development skills are helpful, and Python experience is a plus.
  • Prior experience with training, deploying, or optimizing the inference of DL models in production is a plus.
  • Prior background in performance modeling, profiling, debugging, and code optimization, or architectural knowledge of CPUs and GPUs, is a plus.
  • GPU programming experience (CUDA or OpenCL) is a plus.

Benefits

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis. NVIDIA is committed to fostering a diverse work environment and is proud to be an equal opportunity employer.