Technical Program Manager, Infrastructure

USD 290,000-365,000 per year
MIDDLE
✅ Hybrid
✅ Visa Sponsorship

Used Tools & Technologies

Machine Learning

Required Skills & Competences

Security @ 3, Kubernetes @ 3, GCP @ 3, CI/CD @ 3, Distributed Systems @ 5, AWS @ 3, Azure @ 3, GPU @ 3, Observability @ 2, AI @ 3

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

Responsibilities

Developer Productivity & Tooling

  • Drive cross-functional programs to improve developer environments, CI/CD infrastructure, and release processes that enable rapid innovation while maintaining high security standards
  • Coordinate large-scale migrations and platform modernization efforts across engineering teams
  • Partner with teams to measure and improve developer productivity metrics, identifying bottlenecks and driving systematic improvements
  • Lead initiatives to integrate AI tools into development workflows, helping Anthropic be at the forefront of AI-assisted research and engineering

Infrastructure Reliability & Operations

  • Drive programs to establish and achieve reliability targets across training infrastructure and production services
  • Coordinate incident response improvements, post-mortem processes, and on-call rotations that help teams operate effectively
  • Establish metrics and dashboards to track infrastructure health, capacity utilization, and operational excellence

Cross-functional Coordination

  • Serve as the critical bridge between infrastructure teams, research, and product, translating technical complexities into clear updates for a variety of audiences
  • Consult with stakeholders to deeply understand infrastructure, data, and compute needs, identifying solutions to support frontier research and product development
  • Drive alignment on priorities and timelines across teams with competing constraints

Requirements

  • 5+ years of technical program management experience, with a track record of successfully delivering complex infrastructure programs in ML/AI systems or large-scale distributed systems
  • Deep technical understanding of infrastructure systems—enough to engage substantively with engineers, identify technical risks, and add value beyond project tracking
  • Ability to create structure and processes in ambiguous environments, bringing clarity to complex cross-team initiatives
  • Strong stakeholder management skills and ability to build trust with both technical and non-technical partners
  • Comfort navigating competing priorities and using data to drive technical decisions
  • Experience with developer productivity initiatives, CI/CD systems, or infrastructure scaling
  • Passion for reliability, scalability, security, and continuous improvement
  • Passion for supporting internal partners such as research and understanding their unique needs
  • Passion for AI infrastructure and an understanding of the unique challenges of building and operating systems at frontier scale
  • Experience with Kubernetes, cloud platforms (AWS, GCP, Azure), and ML infrastructure (GPU/TPU/Trainium clusters)
  • Background working with research teams and translating their needs into concrete technical requirements
  • Experience driving adoption of AI tools to improve engineering productivity
  • Familiarity with observability tooling and practices

Logistics

  • Education requirements: at least a Bachelor's degree in a related field or equivalent experience
  • Location-based hybrid policy: currently, staff are expected to be in one of Anthropic's offices at least 25% of the time (some roles may require more time in-office)
  • Visa sponsorship: Anthropic states that it sponsors visas and retains an immigration lawyer to help, though sponsorship is not guaranteed for every role or candidate
  • Deadline to apply: None (applications accepted on a rolling basis)

Compensation

  • Annual Salary: $290,000 - $365,000 USD

About Anthropic / How we're different

  • We believe the highest-impact AI research will be big science, so we work as a single cohesive team on a few large-scale research efforts
  • We value impact over work on smaller, more specific puzzles, and host frequent research discussions to pursue high-impact work
  • Research directions include topics related to GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences

Benefits

  • Competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and an office space for collaboration

Other notes

  • Applicants are encouraged to apply even if they do not meet every qualification listed
  • Guidance on candidate AI usage is provided (link in original posting)
  • Anthropic recruiters contact candidates only from @anthropic.com addresses and will not ask for money or banking information before your first day