Research Engineer, Machine Learning (Horizons)

USD 280,000-425,000 per year
MIDDLE
Hybrid

Used Tools & Technologies

Not specified

Required Skills & Competences

Kubernetes @ 3, Automated Testing @ 3, Python @ 5, Distributed Systems @ 3, Machine Learning @ 3, TensorFlow @ 3, Communication @ 6, Mathematics @ 3, Rust @ 3, Debugging @ 3, API @ 3, LLM @ 2, PyTorch @ 3, GPU @ 3

Details

Anthropic's mission is to create reliable, interpretable, and steerable AI systems. The Horizons team leads Anthropic's reinforcement learning research and development and contributes to Claude models. Its work includes building systems that enable models to use computers, advancing code generation via reinforcement learning, pioneering RL research for large language models, and building scalable RL infrastructure and training methodologies.

Responsibilities

  • Collaborate with researchers and engineers to advance capabilities and safety of large language models.
  • Implement novel approaches and contribute to research direction in a role that blends research and engineering.
  • Work on fundamental RL research to create agentic models via tool use for open-ended tasks (e.g., computer use, autonomous software generation).
  • Improve model reasoning abilities (e.g., mathematics) and develop prototypes for internal use, productivity, and evaluation.
  • Architect and optimize core reinforcement learning infrastructure, including clean training abstractions and distributed experiment management across GPU clusters, and scale systems for complex research workflows.
  • Design, implement, and test novel training environments, evaluations, and methodologies for RL agents.
  • Drive performance improvements across the stack through profiling, optimization, caching solutions, and debugging distributed systems to accelerate training and evaluation workflows.
  • Collaborate across teams to develop automated testing frameworks, design clean APIs, and build scalable infrastructure that accelerates AI research.

Requirements

  • Proficiency in Python and async/concurrent programming (for example, experience with frameworks like Trio).
  • Experience with machine learning frameworks such as PyTorch, TensorFlow, and JAX.
  • Industry experience in machine learning research; ability to balance research exploration with engineering implementation.
  • Strong emphasis on code quality, testing, performance, and systems design.
  • Comfort with pair programming and strong communication skills.
  • Passion for building safe and beneficial AI systems.
  • Education: at least a Bachelor's degree in a related field or equivalent experience.

Strong candidates may have:

  • Familiarity with LLM architectures and training methodologies.
  • Experience with reinforcement learning techniques and environments.
  • Experience with virtualization and sandboxed code execution environments.
  • Experience with Kubernetes.
  • Experience with distributed systems or high-performance computing.
  • Experience with Rust and/or C++.

Strong candidates need not have:

  • Formal certifications or education credentials.
  • Academic research experience or publication history.

Deadline to apply: None. Applications are reviewed on a rolling basis.

Benefits & Logistics

  • Annual salary range: USD 280,000-425,000.
  • Competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and office space for collaboration.
  • Location-based hybrid policy: staff are expected to be in an office at least 25% of the time (some roles may require more time in office).
  • Visa sponsorship is available in many cases; Anthropic retains immigration counsel to assist where possible.
  • Candidates are encouraged to apply even if they do not meet every qualification.

Additional context

  • The team collaborates closely with alignment and frontier red teams and partners with applied production training and RL engineering teams to bring research innovations into deployed models.
  • The role combines empirical, large-scale research with engineering implementation to push forward steerable, trustworthy AI.