Research Engineer, Reward Models

USD 315,000-340,000 per year
MIDDLE SENIOR
✅ Hybrid

SCRAPED

Used Tools & Technologies

Not specified

Required Skills & Competences ?

Python @ 3 GitHub @ 3 Machine Learning @ 6 LLM @ 2

Details

Anthropic’s Reward Modeling team is developing techniques to teach AI systems to understand and embody human values while pushing forward AI capabilities. This expression-of-interest posting is for engineers who want to work on reward modeling for large language models and frontier AI research. Note: interviews for this role are conducted in Python. Headcount for 2025 is filled; applications submitted now will be reviewed as the team grows.

Responsibilities

  • Help implement novel reward modeling architectures and techniques
  • Optimize training pipelines
  • Build and optimize data pipelines
  • Collaborate across teams to integrate reward modeling advances into production systems
  • Communicate engineering progress through internal documentation and potential publications

Requirements

  • Strong engineering background in machine learning with demonstrable expertise in preference learning, reinforcement learning, deep learning, or related areas
  • Proficiency in Python (required) and experience with deep learning frameworks
  • Experience with distributed computing
  • Familiarity with modern LLM architectures and alignment techniques
  • Experience improving model training pipelines and building data pipelines
  • Comfortable with the experimental nature of frontier AI research and able to implement research ideas
  • Ability to clearly communicate complex technical concepts and research findings
  • Deep interest in AI alignment and safety
  • Education: at least a Bachelor’s degree in a related field or equivalent experience

Notes from the posting:

  • Experience with reward models is not required; experience with LLMs or other large models is a significant plus
  • The team welcomes candidates at various experience levels, with a preference for senior engineers
  • Interviews for this role are conducted in Python

Compensation

  • Annual base salary: $315,000 - $340,000 USD
  • The total compensation package for full-time employees includes equity, benefits, and may include incentive compensation

Logistics & Policy

  • Locations: San Francisco, CA; New York City, NY; Seattle, WA (United States)
  • Location-based hybrid policy: currently expect all staff to be in one of our offices at least 25% of the time
  • Visa sponsorship: Anthropic does sponsor visas in many cases and retains an immigration lawyer to assist

Benefits / Why Anthropic

  • Competitive compensation and benefits
  • Optional equity donation matching
  • Generous vacation and parental leave
  • Flexible working hours and office space for collaboration
  • Emphasis on large-scale, collaborative research and frequent research discussions

Application details

  • This is an expression-of-interest form; you may not hear back immediately
  • Applicants are asked for resume or LinkedIn, optional cover letter, GitHub, publications, and answers about availability and relocation
  • Guidance on candidates' AI usage is provided in Anthropic's candidate AI policy