Senior Software Engineer-Distributed Inference
at Nvidia
United States
USD 184,000-356,500 per year
Used Tools & Technologies
Not specified
Required Skills & Competences
- Python @ 7
- GCP @ 3
- GitHub @ 4
- Algorithms @ 4
- Distributed Systems @ 4
- TensorFlow @ 4
- AWS @ 3
- Azure @ 3
- gRPC @ 3
- Protobuf @ 3
- Debugging @ 7
- HTTP @ 3
- JSON @ 3
- OSS @ 4
- LLM @ 4
- PyTorch @ 4
- Agile @ 4
Details
NVIDIA is a pioneer in computer graphics and accelerated computing, leveraging AI to drive the next era of computing with GPUs powering robots, self-driving cars, and more. The team focuses on developing deep learning software and user-facing tools for the Triton Inference Server, supporting breakthroughs in large language models, image classification, speech recognition, and natural language processing.
Responsibilities
- Develop and enhance GenAI-Perf, Triton Performance Analyzer, and Triton Model Analyzer tools.
- Collaborate with researchers and engineers to convert performance analysis requirements into actionable features.
- Work closely with software engineers, system architects, and product managers to improve performance throughout the software lifecycle.
- Set up, execute, and analyze the performance of LLM, Generative AI, and deep learning models.
- Develop algorithms for measuring deep learning throughput, latency, benchmarking large language models, and model deployment.
- Integrate tools into a unified, user-friendly platform for deep learning performance analysis.
- Automate testing processes to ensure tool quality and stability.
- Contribute to technical documentation and stay current with advancements in deep learning performance and LLM optimization.
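The throughput/latency measurement work described above can be sketched in plain Python. This is a minimal illustration only, not Triton's actual tooling: the `infer_fn` callable, the dummy sleeping model, and the percentile choices are assumptions for the example.

```python
import time
import statistics

def benchmark(infer_fn, requests, warmup=2):
    """Measure per-request latency and overall throughput.

    infer_fn: hypothetical stand-in for an inference call (e.g. to a
    model server); requests: list of request payloads.
    """
    # Warm-up iterations so one-time startup costs don't skew results.
    for req in requests[:warmup]:
        infer_fn(req)

    latencies = []
    start = time.perf_counter()
    for req in requests:
        t0 = time.perf_counter()
        infer_fn(req)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    latencies.sort()
    return {
        "throughput_rps": len(requests) / elapsed,
        "p50_ms": statistics.median(latencies) * 1000,
        "p99_ms": latencies[int(0.99 * (len(latencies) - 1))] * 1000,
    }

# Example with a dummy "model" that sleeps ~1 ms per request.
stats = benchmark(lambda r: time.sleep(0.001), list(range(50)))
```

Real tools such as GenAI-Perf add concurrency sweeps, request-rate control, and LLM-specific metrics (e.g. time to first token) on top of this basic pattern.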
Requirements
- Bachelor's, Master's, or PhD degree (or equivalent experience) in Computer Science, Computer Architecture, or a related field.
- 8+ years of experience.
- Knowledge of distributed systems programming.
- Experience in fast-paced, agile team environments.
- Strong expertise in Python programming, software design, debugging, performance analysis, and test design.
Ways to Stand Out
- Experience with deep learning algorithms, especially Large Language Models and frameworks like PyTorch, TensorFlow, TensorRT, and ONNX Runtime.
- Strong troubleshooting skills covering storage systems, kernels, and containers.
- Contributions to large open source projects including GitHub use, bug tracking, code branching/merging, and OSS licensing.
- Familiarity with cloud platforms (AWS, Azure, GCP) and building/deploying cloud services using HTTP REST, gRPC, protobuf, and JSON.
- Experience with NVIDIA GPUs and deep learning inference frameworks.
Benefits
- Competitive base salary between 184,000 USD and 356,500 USD annually, based on location and experience.
- Eligibility for equity and additional benefits.
- NVIDIA is an equal opportunity employer committed to fostering a diverse work environment.