Research Scientist, Agentic Learning (Horizons)

USD 300,000-405,000 per year
MIDDLE
Hybrid

Required Skills & Competences

Kubernetes @ 3, Automated Testing @ 3, Python @ 5, Distributed Systems @ 3, Machine Learning @ 3, TensorFlow @ 3, Communication @ 6, Rust @ 3, Debugging @ 3, API @ 3, LLM @ 3, PyTorch @ 3

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial for users and society. The Horizons team leads reinforcement learning research and development, contributing to Claude models and enhancing AI autonomy and coding capabilities.

Responsibilities

  • Architect and optimize core reinforcement learning infrastructure, including training abstractions and distributed experiment management.
  • Design, implement, and test novel model architectures, training environments, and methodologies for reinforcement learning agents.
  • Drive performance improvements through profiling, optimization, benchmarking, caching solutions, and debugging distributed systems.
  • Collaborate across teams to develop automated testing frameworks, clean APIs, and scalable infrastructure accelerating AI research.

Requirements

  • Proficiency in Python.
  • Experience with JAX and PyTorch.
  • Experience designing and iterating on model architecture improvements.
  • Industry experience with machine learning research and training on production-scale LLMs.
  • Ability to balance research exploration with engineering implementation.
  • Strong focus on code quality, testing, and performance.
  • Strong systems design and communication skills.
  • Passion for AI’s potential impact and commitment to safe, beneficial systems.

Strong candidates may have

  • Experience with continuous learning and parameter-efficient fine-tuning.
  • Experience with TensorFlow.
  • Experience with long-range LLM agent designs.
  • Experience with reinforcement learning techniques and environments.
  • Experience with virtualization, sandboxed code execution, Kubernetes, trio or similar libraries.
  • Experience with distributed systems, high-performance computing, Rust and/or C++.
  • Research experience and publications.

Strong candidates need not have

  • Formal certifications or education credentials.

Benefits

Anthropic is a public benefit corporation headquartered in San Francisco offering competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a collaborative office space.

If an offer is made, the company will make reasonable efforts to sponsor a visa.

The company embraces diversity and encourages candidates from underrepresented groups to apply.

Logistics

  • Education requirements: At least a Bachelor's degree in a related field or equivalent experience.
  • Location-based hybrid policy: Staff are expected in the office at least 25% of the time; some roles may require more onsite presence.

Salary

  • Annual Salary Range: $300,000 - $405,000 USD