Research Engineer, Production Model Post-Training, London

GBP 270,000-340,000 per year
MIDDLE
✅ Hybrid

Required Skills & Competences

  • Python @ 5
  • Distributed Systems @ 3
  • Communication @ 3

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. The Post-Training team enhances production models through sophisticated post-training processes to improve capabilities, alignment, and safety. As a Research Engineer on this team, you will train base models through the complete post-training stack and deliver production Claude models.

Responsibilities

  • Implement and optimize post-training techniques at scale on frontier models (e.g., Constitutional AI, RLHF, other alignment methodologies).
  • Conduct research to develop and optimize post-training recipes that directly improve production model quality.
  • Design, build, and run robust, efficient pipelines for model fine-tuning and evaluation.
  • Develop tools to measure and improve model performance across various dimensions.
  • Collaborate with research teams to translate emerging techniques into production-ready implementations.
  • Debug complex issues in training pipelines and model behavior.
  • Help establish best practices for reliable, reproducible model post-training.
  • Respond to incidents on short notice when needed, including on weekends.

Requirements

  • Proficiency in Python (note: all interviews for this role are conducted in Python).
  • Strong software engineering skills and experience building complex ML systems.
  • Experience with large-scale distributed systems and high-performance computing.
  • Experience with training, fine-tuning, or evaluating large language models (LLMs).
  • Ability to analyze and debug model training processes and balance research exploration with engineering rigor.
  • Bachelor's degree in a related field or equivalent experience (required).
  • Comfortable collaborating across research and engineering disciplines and navigating ambiguity in fast-moving research environments.

Strong candidates may also have:

  • Hands-on experience with LLMs.
  • A keen interest in AI safety and responsible deployment.

Logistics & Additional Information

  • Location: London, United Kingdom. Under the location-based hybrid policy, staff are expected to be in one of Anthropic's offices at least 25% of the time.
  • Visa sponsorship: Anthropic may sponsor visas and retains an immigration lawyer, though sponsorship is not guaranteed for every role.

Compensation & Benefits

  • Annual base salary: £270,000 to £340,000.
  • The total compensation package includes equity and benefits, and may include incentive compensation.
  • Additional benefits: optional equity donation matching, generous vacation and parental leave, flexible working hours, and office space for collaboration.

Company & Culture

  • Anthropic focuses on large-scale, high-impact AI research and values collaboration and communication. The team emphasizes reproducible, empirical research and alignment-focused work. Candidates from a range of experience levels are encouraged to apply; there is a preference for senior engineers with hands-on experience with frontier AI systems.