Research Engineer, Machine Learning (Horizons)

GBP 225,000-340,000 per year
MIDDLE
✅ Hybrid

Used Tools & Technologies

Not specified

Required Skills & Competences

  • Kubernetes @ 3
  • Python @ 5
  • Machine Learning @ 6
  • Communication @ 3
  • Mathematics @ 3
  • Experimentation @ 3
  • PyTorch @ 5

Details

Anthropic is building reliable, interpretable, and steerable AI systems focused on safety and beneficial outcomes. The Research Engineer on the Reinforcement Learning Fundamentals team will collaborate with researchers and engineers to advance capabilities and safety of large language models through fundamental research in reinforcement learning, improving reasoning in areas such as code generation and mathematics, and exploring reinforcement learning for agentic / open-ended tasks.

Responsibilities

  • Collaborate with researchers and engineers to develop and implement novel reinforcement learning techniques to improve performance and safety of large language models.
  • Create tools, environments, and sandboxed execution systems for models to interact with, enabling them to perform complex, open-ended tasks.
  • Design, implement, and run experiments to enhance models' reasoning capabilities, particularly in code generation and mathematics.
  • Contribute high-quality, tested, and performant code; participate in pair programming and code reviews.
  • Work on infrastructure components related to virtualization, sandboxing, and scalable training/experimentation environments.

Requirements

  • 5+ years of industry-related experience.
  • Proficient in Python and experienced with deep learning frameworks such as PyTorch or JAX.
  • Strong software engineering background with emphasis on code quality, testing, and performance.
  • Comfortable working closely with researchers and other engineers; enjoys pair programming.
  • Passionate about the potential impact of AI and committed to developing safe and beneficial systems.

Strong candidates may also have

  • Strong background in machine learning, reinforcement learning, or high-performance computing.
  • Experience with virtualization and sandboxed code execution environments.
  • Experience with Kubernetes.
  • Contributions to open-source projects or published research papers in relevant fields.

Candidates need not have

  • Formal certifications or education credentials (equivalent experience accepted).
  • Prior experience specifically with LLMs or machine learning research.

Logistics & Application Details

  • Location: London, United Kingdom.
  • Location-based hybrid policy: staff expected to be in an office at least 25% of the time; some roles may require more office time.
  • Education: At least a Bachelor's degree in a related field or equivalent experience is required.
  • Visa sponsorship: Anthropic sponsors visas and retains an immigration lawyer. Sponsorship is not guaranteed for every role and candidate, but reasonable efforts will be made.
  • Deadline to apply: None (applications reviewed on a rolling basis).
  • Guidance on Candidates' AI Usage: Anthropic provides a policy for using AI in the application process.

Benefits

  • Competitive compensation (see salary range below).
  • Optional equity donation matching.
  • Generous vacation and parental leave.
  • Flexible working hours and collaborative office space.

About Anthropic

Anthropic is a public benefit corporation headquartered in San Francisco focused on large-scale AI research. The organization emphasizes collaborative, high-impact "big science" research, values clear communication, and encourages diverse perspectives in AI research and development.

Salary

  • Annual Salary: £225,000 - £340,000 GBP