Senior Software Engineer - Distributed Inference

at NVIDIA
📍 United States
USD 184,000-356,500 per year
SENIOR
✅ Remote


Used Tools & Technologies

Not specified

Required Skills & Competences

Python @ 7, GCP @ 3, GitHub @ 4, Algorithms @ 4, Distributed Systems @ 4, TensorFlow @ 4, AWS @ 3, Azure @ 3, gRPC @ 3, Protobuf @ 3, Debugging @ 7, HTTP @ 3, JSON @ 3, OSS @ 4, LLM @ 4, PyTorch @ 4, Agile @ 4

Details

NVIDIA is a pioneer in computer graphics and accelerated computing, using AI to drive the next era of computing, with GPUs powering robots, self-driving cars, and more. The team develops deep learning software and user-facing tools for the Triton Inference Server, supporting breakthroughs in large language models, image classification, speech recognition, and natural language processing.

Responsibilities

  • Develop and enhance GenAI-Perf, Triton Performance Analyzer, and Triton Model Analyzer tools.
  • Collaborate with researchers and engineers to convert performance analysis requirements into actionable features.
  • Work closely with software engineers, system architects, and product managers to improve performance throughout the software lifecycle.
  • Set up, execute, and analyze the performance of LLM, Generative AI, and deep learning models.
  • Develop algorithms for measuring deep learning throughput and latency, for benchmarking large language models, and for model deployment; a minimal measurement sketch follows this list.
  • Integrate tools into a unified, user-friendly platform for deep learning performance analysis.
  • Automate testing processes to ensure tool quality and stability.
  • Contribute to technical documentation and stay current with advancements in deep learning performance and LLM optimization.
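
As a rough illustration of the throughput/latency measurement work described above, here is a minimal sketch that times requests against a hypothetical HTTP inference endpoint. The endpoint URL and payload shape are assumptions; this is in the spirit of GenAI-Perf and the Performance Analyzer, not their actual implementation.

    # Hypothetical benchmark sketch: time individual requests against an
    # inference endpoint and report median latency plus aggregate throughput.
    import json
    import statistics
    import time
    import urllib.request

    ENDPOINT = "http://localhost:8000/v2/models/example/infer"  # assumed URL
    PAYLOAD = json.dumps({"inputs": [{"name": "text", "shape": [1],
                                      "datatype": "BYTES",
                                      "data": ["Hello"]}]}).encode()

    def run_benchmark(num_requests: int = 20) -> None:
        latencies = []
        start = time.perf_counter()
        for _ in range(num_requests):
            t0 = time.perf_counter()
            req = urllib.request.Request(
                ENDPOINT, data=PAYLOAD,
                headers={"Content-Type": "application/json"})
            with urllib.request.urlopen(req) as resp:
                resp.read()
            latencies.append(time.perf_counter() - t0)
        elapsed = time.perf_counter() - start
        print(f"median latency: {statistics.median(latencies) * 1000:.1f} ms")
        print(f"throughput:     {num_requests / elapsed:.1f} req/s")

    if __name__ == "__main__":
        run_benchmark()

Real tools in this space additionally handle concurrency, warm-up, percentile reporting, and streaming token metrics; the sketch only shows the basic measurement loop.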

Requirements

  • Bachelor's, Master's, or PhD degree, or equivalent experience, in Computer Science, Computer Architecture, or a related field.
  • 8+ years of experience.
  • Knowledge of distributed systems programming.
  • Experience in fast-paced, agile team environments.
  • Strong expertise in Python programming, software design, debugging, performance analysis, and test design.

Ways to Stand Out

  • Experience with deep learning algorithms, especially Large Language Models and frameworks like PyTorch, TensorFlow, TensorRT, and ONNX Runtime.
  • Strong troubleshooting skills covering storage systems, kernels, and containers.
  • Contributions to large open source projects, including experience with GitHub, bug tracking, code branching/merging, and OSS licensing.
  • Familiarity with cloud platforms (AWS, Azure, GCP) and building/deploying cloud services using HTTP REST, gRPC, protobuf, and JSON; a minimal JSON-over-HTTP sketch follows this list.
  • Experience with NVIDIA GPUs and deep learning inference frameworks.
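
As context for the cloud-services bullet above, here is a minimal sketch of a toy JSON-over-HTTP endpoint using only the Python standard library. The handler name and payload shape are illustrative assumptions; a production inference service would more likely use gRPC/protobuf or a dedicated serving framework.

    # Hypothetical toy service: accepts a JSON POST and returns a JSON response.
    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class InferHandler(BaseHTTPRequestHandler):
        def do_POST(self) -> None:
            length = int(self.headers.get("Content-Length", 0))
            request = json.loads(self.rfile.read(length) or b"{}")
            # Placeholder "model": report the number of inputs instead of
            # running real inference.
            result = {"outputs": [{"name": "num_inputs",
                                   "data": [len(request.get("inputs", []))]}]}
            body = json.dumps(result).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("0.0.0.0", 8000), InferHandler).serve_forever()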

Benefits

  • Competitive base salary between 184,000 USD and 356,500 USD annually, based on location and experience.
  • Eligibility for equity and additional benefits.
  • Commitment to diversity and equal opportunity employer policies.