Research Engineer, Production Model Post-Training, London

at Anthropic

📍 London, United Kingdom

GBP 270,000-340,000 per year

MIDDLE

✅ Hybrid

Required Skills & Competences

Python @ 5
Distributed Systems @ 3
Communication @ 3

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. The Post-Training team enhances production models through sophisticated post-training processes to improve capabilities, alignment, and safety. As a Research Engineer on this team, you will train base models through the complete post-training stack and deliver production Claude models.

Responsibilities

Implement and optimize post-training techniques at scale on frontier models (e.g., Constitutional AI, RLHF, other alignment methodologies).
Conduct research to develop and optimize post-training recipes that directly improve production model quality.
Design, build, and run robust, efficient pipelines for model fine-tuning and evaluation.
Develop tools to measure and improve model performance across various dimensions.
Collaborate with research teams to translate emerging techniques into production-ready implementations.
Debug complex issues in training pipelines and model behavior.
Help establish best practices for reliable, reproducible model post-training.
Potentially respond to incidents on short notice, including weekends.

Requirements

Proficiency in Python (note: all interviews for this role are conducted in Python).
Strong software engineering skills and experience building complex ML systems.
Experience with large-scale distributed systems and high-performance computing.
Experience with training, fine-tuning, or evaluating large language models (LLMs).
Ability to analyze and debug model training processes and balance research exploration with engineering rigor.
Bachelor's degree in a related field or equivalent experience (required).
Comfortable collaborating across research and engineering disciplines and navigating ambiguity in fast-moving research environments.

Strong candidates may also have:

Hands-on experience with LLMs.
A keen interest in AI safety and responsible deployment.

Logistics & Additional Information

Location: London, United Kingdom. Under the location-based hybrid policy, staff are expected to be in one of Anthropic's offices at least 25% of the time.
Visa sponsorship: Anthropic may sponsor visas and retains an immigration lawyer, though sponsorship is not guaranteed for every role.
Education: Minimum of a Bachelor's degree in a related field or equivalent experience.
Interview note: All interviews for this role are conducted in Python.
This role may require responding to incidents on short notice, including weekends.

Compensation & Benefits

Annual base salary: £270,000 to £340,000.
The total compensation package includes equity and benefits, and may include incentive compensation.
Additional benefits mentioned: competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, office space for collaboration.

Company & Culture

Anthropic focuses on large-scale, high-impact AI research and values collaboration and communication. The team emphasizes reproducible, empirical research and alignment-focused work. Candidates from a range of experience levels are encouraged to apply, though senior engineers with hands-on experience with frontier AI systems are preferred.