Machine Learning Systems Engineer - Infrastructure & Runtime, Horizons

at Anthropic

📍 New York City, United States
📍 San Francisco, United States

USD 300,000-405,000 per year

MIDDLE

✅ Hybrid

Required Skills & Competences

Security @ 3, Kubernetes @ 3, Terraform @ 2, Python @ 5, ETL @ 3, Communication @ 3, Performance Optimization @ 3, Rust @ 3, Experimentation @ 3

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. The Horizons team leads Anthropic's reinforcement learning research and development, contributing to Claude models and advancing autonomy and code generation capabilities. The team builds systems for models to use computers, improves code-generation via reinforcement learning, pioneers RL research for LLMs, and constructs scalable RL infrastructure and training methodologies.

Responsibilities

Build and maintain foundational systems that enable AI research, focusing on code execution environments, data pipelines, and performance optimization.
Design and implement high-performance data pipelines for processing large-scale code datasets with emphasis on reliability and reproducibility.
Build and maintain secure sandboxed execution environments using virtualization technologies such as gVisor and Firecracker.
Develop infrastructure for reinforcement learning training environments that balance security and performance.
Optimize resource utilization across distributed computing infrastructure through profiling, benchmarking, and systems-level improvements.
Collaborate with researchers to translate research requirements into scalable, production-grade systems for AI experimentation.

Requirements

Proficiency in Python and async/concurrent programming, including experience with frameworks such as Trio.
Experience with container technologies and virtualization systems such as gVisor and Firecracker.
Strong systems programming skills and understanding of performance optimization.
Experience with data pipeline development and ETL processes.
Strong attention to code quality, testing, and performance.
Effective communication with both technical and research-focused team members.
Passion for developing safe and beneficial AI systems.

Strong candidates may have:

Experience with cloud infrastructure and Kubernetes orchestration.
Familiarity with infrastructure-as-code tools (Terraform, Pulumi, etc.).
Experience contributing to open-source projects in systems or infrastructure.
Knowledge of Rust and/or C++ for performance-critical components.
Experience implementing security controls for code execution.
Comfort engaging with ML research concepts and translating them to engineering requirements.

Strong candidates need not have:

Formal certifications or education credentials.
Prior experience with LLMs, reinforcement learning, or ML research.

Logistics

Location: San Francisco, CA and New York City, NY (positions based in either office). Location-based hybrid policy: staff are expected to be in one of the offices at least 25% of the time; some roles may require more time onsite.
Education: At least a Bachelor's degree in a related field or equivalent experience is required.
Visa sponsorship: Anthropic sponsors visas. When an offer is made, it will make reasonable efforts to secure a visa and will retain immigration counsel, though sponsorship is not guaranteed for every role or candidate.
Deadline to apply: None. Applications are reviewed on a rolling basis.

Benefits

Competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and office space for collaboration.
Emphasis on collaborative research culture and high-impact AI research.

Representative projects (examples)

High-performance data pipeline design for large-scale code datasets.
Secure sandboxed execution environments built on virtualization technologies (gVisor, Firecracker).
Infrastructure for RL training environments that balance security and performance needs.
Profiling and benchmarking to optimize distributed compute resource utilization.
Translating research prototypes into scalable systems for experimentation.

About Anthropic & Horizons

Anthropic is a public benefit corporation headquartered in San Francisco. The Horizons team sits at the intersection of cutting-edge research and engineering, collaborating closely with alignment teams, frontier red teams, and applied production training teams to ensure systems are capable, safe, and deployable.