Research Engineer, Reward Models

USD 315,000-340,000 per year
SENIOR
Hybrid

Used Tools & Technologies

Not specified

Required Skills & Competences

  • Python @ 6
  • Machine Learning @ 7
  • Hiring @ 4
  • LLM @ 3

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. The team is focused on developing AI systems that are safe and beneficial for society, working collaboratively across research, engineering, policy, and business domains.

Responsibilities

  • Implement novel reward modeling architectures and techniques
  • Optimize training pipelines
  • Build and optimize data pipelines
  • Collaborate with other teams to integrate reward modeling advances into production systems
  • Communicate engineering progress through internal documentation and potential publications

Requirements

  • Strong engineering background in machine learning, with expertise in preference learning, reinforcement learning, deep learning, or related areas
  • Proficiency in Python, deep learning frameworks, and distributed computing
  • Familiarity with modern large language model (LLM) architectures and alignment techniques
  • Experience improving model training and building data pipelines
  • Comfort with the experimental nature of frontier AI research
  • Ability to implement research ideas and communicate complex technical concepts effectively
  • Deep interest in AI alignment and safety

Experience with reward models is not required, but experience with LLMs or other large models is a significant plus. Candidates at various experience levels are welcome, with a preference for senior engineers who have hands-on experience with frontier AI systems.

Benefits

  • Competitive compensation
  • Optional equity donation matching
  • Generous vacation and parental leave
  • Flexible working hours
  • Collaborative office space
  • Visa sponsorship assistance

Logistics

  • Education: Bachelor's degree or equivalent experience required
  • Hybrid work policy: Staff expected to be in office at least 25% of the time
  • Emphasis on diversity and inclusive hiring
  • Collaborative and impactful research environment