Research Lead, Training Insights

at Anthropic

📍 New York City, United States
📍 San Francisco, United States

USD 850,000 per year

SENIOR

✅ Hybrid

✅ Visa Sponsorship

Used Tools & Technologies

LLM

Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level. About proficiency levels:

1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose;

3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology;

7-9: expert. You can teach others, and you know all the pitfalls and tricks;

10: exceptional knowledge. Comprehensive understanding of and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.

Machine Learning @ 4
Leadership @ 4
Communication @ 7
Mentoring @ 4
AI @ 4
Reinforcement Learning @ 4

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. The Training Insights team measures and characterizes model capabilities across training and deployment. This Research Lead role is a hands-on leadership position: you will develop evaluation strategy, drive original research into new evaluation methodologies, and lead a small team of researchers and research engineers to measure how capabilities emerge during training and after deployment.

Responsibilities

Build novel and long-horizon evaluations that test model capabilities requiring sustained reasoning, planning, and tool use over extended interactions
Develop measurement approaches to understand how model capabilities emerge and evolve during and after reinforcement learning (RL) training
Lead strategic evaluation coverage across the company and shape the evaluation narrative for model releases
Lead and mentor a small team of researchers and research engineers; set research direction and foster rigorous, creative research
Design evaluation frameworks that balance scientific rigor with production training schedules
Build and maintain cross-organizational relationships (Reinforcement Learning, Pretraining, Inference, Product, Alignment, Safeguards, etc.) to ensure evaluation insights inform training and deployment decisions
Contribute to the broader research community through publications, open-source work, or external engagement on evaluation best practices

Requirements

Significant experience designing and running evaluations for large language models or similar complex ML systems
Experience leading technical projects or teams, either formally or via sustained ownership of critical research directions
Comfortable designing experiments and writing code; able to move between research and implementation fluidly
Strategic thinking about what to measure and why, not just how to measure it
Ability to synthesize information across multiple teams and workstreams to form a coherent picture of model capabilities
Strong communication skills for both technical and non-technical audiences
Results-oriented and able to thrive in fast-paced environments with shifting priorities
Care deeply about AI safety and want your work to directly influence how capable AI systems are developed and deployed

Additional Qualifications (strong candidates may also have)

Experience building evaluations for long-horizon or agentic tasks
Deep familiarity with reinforcement learning training dynamics and how model behavior changes during training
Published research in machine learning evaluation, benchmarking, or related areas
Experience with safety evaluation frameworks and red teaming methodologies
Background in psychometrics, experimental psychology, or other measurement-focused disciplines
Track record of communicating evaluation results to inform high-stakes decisions about model development or deployment
Experience managing or mentoring researchers and engineers

Representative Projects

Design and implement suites of long-horizon evaluations for sustained reasoning, planning, and tool use
Build systems to track capability development across RL training checkpoints and surface insights about when/how capabilities emerge
Conduct cross-organizational audits of evaluation coverage and prioritize new evaluations to fill gaps across Pretraining, RL, Inference, and Product
Develop evaluation methodology and narrative for major model releases
Research and prototype novel evaluation approaches for capabilities that are difficult to measure with existing benchmarks
Lead efforts to build reusable evaluation infrastructure serving multiple research teams

Compensation

Annual Salary: $850,000 USD

Logistics

Education: At least a Bachelor's degree in a related field or equivalent experience
Location-based hybrid policy: staff are expected to be in one of Anthropic's offices at least 25% of the time (some roles may require more time in office)
Remote-friendly (travel required); role lists San Francisco, CA and New York City, NY as office locations
Visa sponsorship: Anthropic indicates that it sponsors visas and retains an immigration lawyer to help, though sponsorship is not guaranteed for every role or candidate

Benefits

Competitive compensation and benefits
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours
Office space for collaboration

How Anthropic is different

Anthropic works as a single cohesive team on a few large-scale research efforts, values impact over work on smaller puzzles, and treats AI research as an empirical science. Frequent research discussions and a strong emphasis on communication are core to the culture.