Senior Software Engineer - Distributed Inference

at NVIDIA
📍 United States
USD 184,000-356,500 per year
SENIOR
✅ Remote


Used Tools & Technologies

Not specified

Required Skills & Competences

Python @ 7, GCP @ 3, GitHub @ 4, Algorithms @ 4, Distributed Systems @ 4, TensorFlow @ 4, AWS @ 3, Azure @ 3, gRPC @ 3, Protobuf @ 3, Debugging @ 7, HTTP @ 3, JSON @ 3, OSS @ 4, LLM @ 4, PyTorch @ 4, Agile @ 4

Details

NVIDIA is a pioneer in computer graphics and accelerated computing, using AI to drive the next era of computing, with GPUs powering robots, self-driving cars, and more. The team develops deep learning software and user-facing tools for the Triton Inference Server, supporting breakthroughs in large language models, image classification, speech recognition, and natural language processing.

Responsibilities

  • Develop and enhance GenAI-Perf, Triton Performance Analyzer, and Triton Model Analyzer tools.
  • Collaborate with researchers and engineers to convert performance analysis requirements into actionable features.
  • Work closely with software engineers, system architects, and product managers to improve performance throughout the software lifecycle.
  • Set up, execute, and analyze the performance of LLM, Generative AI, and deep learning models.
  • Develop algorithms for measuring deep learning throughput and latency, for benchmarking large language models, and for model deployment; a minimal measurement sketch follows this list.
  • Integrate tools into a unified, user-friendly platform for deep learning performance analysis.
  • Automate testing processes to ensure tool quality and stability.
  • Contribute to technical documentation and stay current with advancements in deep learning performance and LLM optimization.
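
As a rough illustration of the throughput/latency measurement work described above, here is a minimal sketch that times requests against a hypothetical HTTP inference endpoint. The endpoint URL and payload shape are assumptions; this is in the spirit of GenAI-Perf and the Performance Analyzer, not their actual implementation.

    # Hypothetical benchmark sketch: time individual requests against an
    # inference endpoint and report median latency plus aggregate throughput.
    import json
    import statistics
    import time
    import urllib.request

    ENDPOINT = "http://localhost:8000/v2/models/example/infer"  # assumed URL
    PAYLOAD = json.dumps({"inputs": [{"name": "text", "shape": [1],
                                      "datatype": "BYTES",
                                      "data": ["Hello"]}]}).encode()

    def run_benchmark(num_requests: int = 20) -> None:
        latencies = []
        start = time.perf_counter()
        for _ in range(num_requests):
            t0 = time.perf_counter()
            req = urllib.request.Request(
                ENDPOINT, data=PAYLOAD,
                headers={"Content-Type": "application/json"})
            with urllib.request.urlopen(req) as resp:
                resp.read()
            latencies.append(time.perf_counter() - t0)
        elapsed = time.perf_counter() - start
        print(f"median latency: {statistics.median(latencies) * 1000:.1f} ms")
        print(f"throughput:     {num_requests / elapsed:.1f} req/s")

    if __name__ == "__main__":
        run_benchmark()

Real tools in this space additionally handle concurrency, warm-up, percentile reporting, and streaming token metrics; the sketch only shows the basic measurement loop.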

Requirements

  • Bachelor's, Master's, or PhD degree, or equivalent experience, in Computer Science, Computer Architecture, or a related field.
  • 8+ years of experience.
  • Knowledge of distributed systems programming.
  • Experience in fast-paced, agile team environments.
  • Strong expertise in Python programming, software design, debugging, performance analysis, and test design.

Ways to Stand Out

  • Experience with deep learning algorithms, especially Large Language Models and frameworks like PyTorch, TensorFlow, TensorRT, and ONNX Runtime.
  • Strong troubleshooting skills covering storage systems, kernels, and containers.
  • Contributions to large open source projects, including experience with GitHub, bug tracking, code branching/merging, and OSS licensing.
  • Familiarity with cloud platforms (AWS, Azure, GCP) and building/deploying cloud services using HTTP REST, gRPC, protobuf, and JSON; a minimal JSON-over-HTTP sketch follows this list.
  • Experience with NVIDIA GPUs and deep learning inference frameworks.
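
As context for the cloud-services bullet above, here is a minimal sketch of a toy JSON-over-HTTP endpoint using only the Python standard library. The handler name and payload shape are illustrative assumptions; a production inference service would more likely use gRPC/protobuf or a dedicated serving framework.

    # Hypothetical toy service: accepts a JSON POST and returns a JSON response.
    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class InferHandler(BaseHTTPRequestHandler):
        def do_POST(self) -> None:
            length = int(self.headers.get("Content-Length", 0))
            request = json.loads(self.rfile.read(length) or b"{}")
            # Placeholder "model": report the number of inputs instead of
            # running real inference.
            result = {"outputs": [{"name": "num_inputs",
                                   "data": [len(request.get("inputs", []))]}]}
            body = json.dumps(result).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("0.0.0.0", 8000), InferHandler).serve_forever()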

Benefits

  • Competitive base salary between 184,000 USD and 356,500 USD annually, based on location and experience.
  • Eligibility for equity and additional benefits.
  • Commitment to diversity and equal opportunity employer policies.