Used Tools & Technologies
Not specified
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such level.
Algorithms @ 3
Communication @ 3
Networking @ 3
Reporting @ 3
GPU @ 3
AI @ 3
Details
About the Team
The Stargate team is responsible for building the physical infrastructure that powers OpenAI’s largest-scale AI systems. We design, deploy, and operate next-generation data center infrastructure across a rapidly expanding footprint, bringing together hardware, networking, facilities, supply chain, and deployment execution.
This work sits at the intersection of advanced AI hardware and real-world infrastructure delivery. Our team turns compute requirements into deployable, reliable, and scalable systems that support frontier AI workloads.
About the Role
We are seeking a Hardware Operations Technical Program Manager to drive execution across the lifecycle of AI infrastructure hardware programs. In this role you will own cross-functional program execution across hardware readiness, supplier coordination, deployment planning, rack-level integration, manufacturing operations, logistics, field deployment, and operational handoff. You will partner closely with hardware engineering, data center engineering, networking, supply chain, manufacturing, deployment, and operations teams to ensure critical infrastructure programs move from design intent to production readiness.
This role is ideal for someone who can operate at both the technical and programmatic level: understanding hardware systems, identifying operational blockers, driving accountability across teams, and creating scalable processes for high-volume infrastructure deployment.
Responsibilities
- Drive end-to-end Hardware Operations readiness programs across AI infrastructure systems, including servers, racks, networking hardware, power and cooling interfaces, and related data center infrastructure.
- Develop and operationalize scalable hardware operations processes, workflows, and support models spanning deployment, repair operations, diagnostics, break/fix, escalation management, and sustaining operations.
- Lead cross-functional execution of Hardware Operations readiness initiatives, ensuring operational capabilities, tooling, documentation, staffing models, and workflows are established prior to production deployment and operational handoff.
- Partner across Hardware Engineering, Manufacturing, Supply Chain, Data Center Operations, Network Operations, Deployment, Reliability Engineering, and external suppliers to ensure alignment on operational requirements, supportability, and readiness milestones.
- Develop operational scorecards, reporting frameworks, and metric algorithms to measure hardware operational health, repair performance, deployment quality, readiness status, and execution efficiency.
- Identify operational, technical, supplier, tooling, and process risks early; drive mitigation plans, cross-functional alignment, and executive-level communication.
- Lead cross-functional issue resolution efforts during hardware deployment, validation, operational ramp, and sustaining operations, ensuring rapid containment, corrective action development, and long-term process improvement.
- Create and mature operational governance models, including standardized readiness reviews, action tracking, escalation management, performance reviews, and operational business rhythms.
- Ensure operational knowledge sharing and alignment across internal teams, external suppliers, and infrastructure partners to improve execution consistency, issue resolution efficiency, and operational maturity.
You Might Thrive in This Role If You
- Have experience driving complex hardware or infrastructure programs from development through production and deployment.
- Are comfortable operating across engineering, manufacturing, supply chain, deployment, and operations teams.
- Can understand technical system dependencies without needing to be the deepest engineer in every domain.
- Know how to create structure in ambiguous, fast-moving environments.
- Are effective at driving accountability across teams and vendors without direct authority.
- Can move between tactical execution details and executive-level communication.
- Have strong judgment around when to escalate, when to unblock directly, and when to create a repeatable process.
Qualifications
- 7+ years of experience in technical program management, hardware operations, manufacturing operations, infrastructure deployment, or related technical execution roles.
- Experience supporting hardware systems at scale, ideally including servers, racks, networking hardware, data center infrastructure, or high-performance compute environments.
- Strong understanding of the hardware development and deployment lifecycle, including new product introduction (NPI), qualification, manufacturing ramp, logistics, installation, validation, and operational support.
- Demonstrated ability to manage complex cross-functional schedules, dependencies, risks, and executive communications.
- Strong technical fluency across hardware systems, rack integration, manufacturing readiness, and infrastructure deployment.
- Proven ability to operate in ambiguous environments and create scalable execution mechanisms.
- Excellent written and verbal communication skills, with the ability to influence technical and non-technical stakeholders.
- Experience with AI infrastructure, hyperscale data centers, cloud infrastructure, or high-density compute systems is a plus.
- Bachelor’s degree in engineering, computer science, operations, supply chain, or equivalent practical experience.
Preferred Skills
- Experience with GPU, accelerator, server, rack, or cluster-scale infrastructure programs.
- Background in hardware operations, manufacturing program management, supply chain operations, data center deployment, or technical infrastructure TPM.
- Familiarity with rack integration, power/cooling constraints, cabling, networking, serviceability, and deployment readiness.
- Experience building operating rhythms, readiness reviews, risk registers, launch dashboards, or executive program reviews.
- Experience scaling programs from prototype/NPI into repeatable production deployment.
Benefits
- Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
- Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
- 401(k) retirement plan with employer match
- Paid parental leave (up to 24 weeks for birthing parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
- Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
- 13+ paid company holidays and multiple paid coordinated company office closures throughout the year, plus paid sick or safe time
- Mental health and wellness support
- Employer-paid basic life and disability coverage
- Annual learning and development stipend
- Daily meals in offices and meal delivery credits as eligible
- Relocation support for eligible employees
- Additional taxable fringe benefits such as charitable donation matching and wellness stipends
About OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. OpenAI is an equal opportunity employer and is committed to providing reasonable accommodations to applicants with disabilities. Background checks will be administered in accordance with applicable law.