Senior Technical Program Manager, DGX Cloud Software Products And Service
at NVIDIA
USD 200,000-322,000 per year
Used Tools & Technologies
Not specified
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Software Development @ 4
Kubernetes @ 4
CI/CD @ 3
Distributed Systems @ 4
Machine Learning @ 7
Hiring @ 4
Leadership @ 7
Communication @ 4
IaaS @ 4
Microservices @ 4
API @ 4
Reporting @ 4
Agile @ 4
Deep Learning @ 7
Observability @ 3
AI @ 4
Details
The DGX Cloud team is hiring a Senior Technical Program Manager (TPM) to lead complex, cross-functional software initiatives that support NVIDIA's next-generation AI infrastructure. The role focuses on cloud-native software delivery, Kubernetes-based platforms, infrastructure services, and large-scale AI workloads. The position requires strong technical skills, proactive program leadership, and the ability to align priorities and drive execution across multiple levels of the organization.
Responsibilities
- Lead end-to-end implementation of DGX Cloud software initiatives, including planning, management, delivery, and operationalization across NVIDIA's cloud infrastructure.
- Partner with software, infrastructure, product, and platform engineering teams to align on goals, architecture, deliverables, and schedules.
- Lead initiatives involving Kubernetes-based platforms, cloud-native services, platform APIs, and distributed systems that enable AI training and inference workloads.
- Define and implement scalable program management processes, tools, and guidelines to ensure high execution velocity and program transparency.
- Identify cross-functional dependencies, mitigate risks, and drive resolution of complex technical and programmatic issues across the software stack.
- Establish success metrics and reporting mechanisms to track progress and communicate status to senior leadership.
- Foster collaboration and continuous improvement across engineering, product, and operations teams.
- Develop and implement metrics for assessing program efficiency, and collect and analyze data to support planning and data-driven decisions.
- Report on overall program status and provide insights and recommendations to senior management.
- Drive organizational alignment and efficiency by coordinating with multi-functional leads and streamlining processes across software lifecycles and releases.
Requirements
- Postgraduate degree in Computer Science, Artificial Intelligence, or equivalent experience.
- 12+ years of program management experience, including managing global projects across multiple time zones.
- Solid knowledge of cloud-native software systems, Kubernetes, containerized applications, microservices architectures, and infrastructure-as-a-service (IaaS) platforms.
- Practical, hands-on experience working with Kubernetes (required).
- Proven experience driving large-scale software programs in fast-paced engineering environments.
- Strong understanding of software engineering guidelines, release procedures, system integration, and platform delivery.
- Proven ability to creatively resolve technical issues and resource conflicts.
- Detail-oriented, with the ability to multitask in a dynamic environment with shifting priorities.
- Direct experience working within a dynamic software development environment.
- Excellent communication and technical presentation skills.
- Significant experience with large-scale Agile tools, reporting, and processes relevant to this role.
- Demonstrated skill in leading and moderating successful engagements with engineering, operations, and product teams.
Preferred / Ways To Stand Out
- Strong background in Machine Learning, Deep Learning, and AI applications.
- Prior experience leading programs for Kubernetes platforms, cloud-native infrastructure, platform services, or developer platforms.
- Experience with software release management, service operationalization, and large-scale platform adoption.
- Familiarity with observability, CI/CD, infrastructure automation, and service reliability practices in cloud environments.
- Track record of driving process improvements and measuring efficiency.
- Familiarity with NVIDIA platforms, products, and the wider NVIDIA ecosystem is a plus.
Additional Details
- Location: Santa Clara, CA, United States (hybrid) — #LI-Hybrid
- Employment type: Full time
- Base salary range: 200,000 USD - 322,000 USD (will be determined by location, experience, and pay of employees in similar positions)
- Eligible for equity and company benefits (link to NVIDIA benefits referenced in the posting).
- Applications accepted until May 26, 2026.
- NVIDIA uses AI tools in its recruiting processes and is an equal opportunity employer.