Required Skills & Competences
Security @ 4, Software Development @ 7, Ceph @ 4, DevOps @ 7, GCP @ 3, AWS @ 3, Azure @ 3, Communication @ 7, Performance Monitoring @ 4, Jira @ 6, Product Management @ 4, Compliance @ 4, Agile @ 7
Details
NVIDIA’s DGX Cloud is redefining how organizations deploy and scale AI infrastructure. We’re looking for a Senior Technical Program Manager to drive storage-related initiatives across development, operations, and cloud deployment. This is a high-impact role interfacing with engineering, product, operations, finance, and our global cloud partners.
Responsibilities
- Lead cross-functional storage programs from requirements gathering through execution and delivery.
- Drive alignment across NVIDIA storage engineering, operations, cloud service providers, cluster operators, resource governance, and finance.
- Define project plans, schedules, and milestones for storage features, storage deployments, support, security, compliance, and observability.
- Partner with the engineering team and product management to define and deliver product roadmaps.
- Manage technical risks and resolve blockers that impact quality, scope, and delivery timelines.
- Coordinate with cross-functional teams to improve workflows, efficiency, and transparency.
- Ensure program visibility across the organization and maintain strong communication channels with senior stakeholders.
- Improve organizational efficiency by collaborating with cross-functional leads and optimizing processes.
- Cultivate a culture of continuous improvement by identifying opportunities for process enhancements.
Requirements
- 12+ years of experience in program management of large-scale software or infrastructure projects.
- MS in EE or CS, or equivalent experience.
- Proven success driving programs across global, distributed teams.
- Outstanding communication and organizational skills, with the ability to align cross-org stakeholders.
- Expertise with tools like Jira and Confluence, and the ability to guide teams in their use.
- Strong foundation in software development, Agile methodologies, and DevOps best practices.
- Familiarity with cloud platforms (AWS, Azure, GCP, OCI) and their block, object, and file storage services.
- Knowledge of distributed storage systems: SAN, NAS, object storage, and scalable distributed architectures such as Ceph or Lustre.
- Understanding of storage performance metrics and optimization (IOPS, latency, throughput) and capacity planning for large-scale environments.
- Familiarity with data protection and disaster recovery strategies: snapshots, backups, replication, DR planning.
- Understanding of storage requirements for AI/ML and HPC workloads (high-throughput AI training and data pipelines).
Ways to stand out
- Hands-on experience with storage operations, provisioning, performance monitoring, and troubleshooting.
- Experience with new product introduction and with program management for research teams.
Compensation & Benefits
- Base salary range: 192,000 USD - 304,750 USD (final base salary determined by location, experience, and pay of employees in similar positions).
- Eligible for equity and benefits.
Additional information
- Location: Santa Clara, CA, United States.
- Employment type: Full time.
- Work model: Hybrid (#LI-Hybrid).
- Applications accepted at least until December 22, 2025.
- NVIDIA is an equal opportunity employer committed to diversity and inclusion.