Research Engineer, Reward Models

at Anthropic

📍 New York City, United States
📍 San Francisco, United States
📍 Seattle, United States

USD 315,000-340,000 per year

MIDDLE SENIOR

✅ Hybrid

SCRAPED

Used Tools & Technologies

Not specified

Required Skills & Competences ^?

Python @ 3 GitHub @ 3 Machine Learning @ 6 LLM @ 2

Details

Anthropic’s Reward Modeling team is developing techniques to teach AI systems to understand and embody human values while pushing forward AI capabilities. This expression-of-interest posting is for engineers who want to work on reward modeling for large language models and frontier AI research. Note: interviews for this role are conducted in Python. Headcount for 2025 is filled; applications submitted now will be reviewed as the team grows.

Responsibilities

Help implement novel reward modeling architectures and techniques
Optimize training pipelines
Build and optimize data pipelines
Collaborate across teams to integrate reward modeling advances into production systems
Communicate engineering progress through internal documentation and potential publications

Requirements

Strong engineering background in machine learning with demonstrable expertise in preference learning, reinforcement learning, deep learning, or related areas
Proficiency in Python (required) and experience with deep learning frameworks
Experience with distributed computing
Familiarity with modern LLM architectures and alignment techniques
Experience improving model training pipelines and building data pipelines
Comfortable with the experimental nature of frontier AI research and able to implement research ideas
Ability to clearly communicate complex technical concepts and research findings
Deep interest in AI alignment and safety
Education: at least a Bachelor’s degree in a related field or equivalent experience

Notes from the posting:

Experience with reward models is not required; experience with LLMs or other large models is a significant plus
The team welcomes candidates at various experience levels, with a preference for senior engineers
Interviews for this role are conducted in Python

Compensation

Annual base salary: $315,000 - $340,000 USD
The total compensation package for full-time employees includes equity, benefits, and may include incentive compensation

Logistics & Policy

Locations: San Francisco, CA; New York City, NY; Seattle, WA (United States)
Location-based hybrid policy: currently expect all staff to be in one of our offices at least 25% of the time
Visa sponsorship: Anthropic does sponsor visas in many cases and retains an immigration lawyer to assist

Benefits / Why Anthropic

Competitive compensation and benefits
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours and office space for collaboration
Emphasis on large-scale, collaborative research and frequent research discussions

Application details

This is an expression-of-interest form; you may not hear back immediately
Applicants are asked for resume or LinkedIn, optional cover letter, GitHub, publications, and answers about availability and relocation
Guidance on candidates' AI usage is provided in Anthropic's candidate AI policy