Research Engineer, Production Model Post-Training, London
at Anthropic
GBP 270,000-340,000 per year
Required Skills & Competences
Python @ 5, Distributed Systems @ 3, Communication @ 3
Details
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. The Post-Training team enhances production models through sophisticated post-training processes to improve capabilities, alignment, and safety. As a Research Engineer on this team, you will train base models through the complete post-training stack and deliver production Claude models.
Responsibilities
- Implement and optimize post-training techniques at scale on frontier models (e.g., Constitutional AI, RLHF, other alignment methodologies).
- Conduct research to develop and optimize post-training recipes that directly improve production model quality.
- Design, build, and run robust, efficient pipelines for model fine-tuning and evaluation.
- Develop tools to measure and improve model performance across various dimensions.
- Collaborate with research teams to translate emerging techniques into production-ready implementations.
- Debug complex issues in training pipelines and model behavior.
- Help establish best practices for reliable, reproducible model post-training.
- Respond to incidents on short notice when needed, including on weekends.
Requirements
- Proficiency in Python (note: all interviews for this role are conducted in Python).
- Strong software engineering skills and experience building complex ML systems.
- Experience with large-scale distributed systems and high-performance computing.
- Experience with training, fine-tuning, or evaluating large language models (LLMs).
- Ability to analyze and debug model training processes and balance research exploration with engineering rigor.
- Bachelor's degree in a related field or equivalent experience (required).
- Comfortable collaborating across research and engineering disciplines and navigating ambiguity in fast-moving research environments.
Strong candidates may also have:
- Hands-on experience with LLMs.
- A keen interest in AI safety and responsible deployment.
Logistics & Additional Information
- Location: London, United Kingdom. Under the location-based hybrid policy, staff are expected to be in one of Anthropic's offices at least 25% of the time.
- Visa sponsorship: Anthropic may sponsor visas and retains an immigration lawyer, though sponsorship is not guaranteed for every role.
- Education: Minimum of a Bachelor's degree in a related field or equivalent experience.
- Interview note: all interviews for this role are conducted in Python.
- This role may require responding to incidents on short notice, including weekends.
Compensation & Benefits
- Annual base salary: GBP 270,000-340,000.
- The total compensation package includes equity and benefits, and may include incentive compensation.
- Additional benefits: optional equity donation matching, generous vacation and parental leave, flexible working hours, and office space for collaboration.
Company & Culture
- Anthropic focuses on large-scale, high-impact AI research and values collaboration and communication. The team emphasizes reproducible, empirical research and alignment-focused work. Candidates from a range of experience levels are encouraged to apply; there is a preference for senior engineers with hands-on experience with frontier AI systems.