Used Tools & Technologies
Not specified
Required Skills & Competences
- Software Development @ 3
- Distributed Systems @ 6
- Hiring @ 3
- Communication @ 3
- LLM @ 3
- GPU @ 3
Details
We are seeking a Software Engineering Manager to lead the Dynamo engineering team, which builds NVIDIA’s high-performance, low-latency inference platform for serving generative AI and reasoning workloads at scale. The team accelerates deployment of cutting-edge models across diverse engines and architectures, enabling breakthroughs from real-time LLM serving to complex multi-GPU, multi-node pipelines. The ideal candidate is strong in software development, has designed and built fault-tolerant distributed systems, and can define and execute a well-thought-out long-term maintenance strategy.
Responsibilities
- Mentor, grow, and develop the Dynamo engineering team and be responsible for planning and execution of projects and workflows.
- Work across multiple teams and organizations to build platforms that apply the latest developments in LLM inference; collaborate with research and development teams and serve a large user base of software teams both internal and external to NVIDIA.
- Align priorities across collaborators and define metrics for measuring the success of the product/team.
- Stay updated with the latest trends in AI, ML, and infrastructure, proactively seeking opportunities to integrate advancements into NVIDIA's LLM and AI infrastructure solutions.
Requirements
- Master’s or PhD (or equivalent experience) in Computer Science, Computer Architecture, or a related field.
- 10+ years of overall experience in developing large distributed systems.
- 2+ years of experience managing AI and software development teams.
- Experience in developing and maintaining LLM or Generative AI infrastructure.
- Hands-on experience developing large-scale distributed systems and designing fault-tolerant systems.
- Excellent communication, collaboration and problem-solving skills, with a dedication to encouraging an inclusive and diverse workplace.
Ways to stand out
- Strong technical background in cloud/distributed systems.
- Experience working in a globally distributed organization.
- Good knowledge of CPU and/or GPU hardware architecture.
- Background in developing LLM inference systems.
- Experience with LLM inference frameworks such as vLLM and TRT-LLM.
Benefits
- Eligible for equity and company benefits (see NVIDIA benefits page: https://www.nvidia.com/en-us/benefits/).
Location & Work Model
- Location: Santa Clara, California, United States.
- Work model: Hybrid.
Compensation & Application
- Base salary range: 224,000 USD - 356,500 USD (final base salary determined by location, experience, and pay of employees in similar positions).
- You will also be eligible for equity and benefits.
- Applications accepted at least until September 2, 2025.
NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer. We do not discriminate in hiring or promotion practices on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.