Research Manager, Interpretability

at Anthropic

📍 San Francisco, United States

USD 340,000-425,000 per year

MIDDLE

✅ Hybrid

Used Tools & Technologies

Not specified

Required Skills & Competences

Algorithms @ 3, Machine Learning @ 3, Hiring @ 3, Leadership @ 3, People Management @ 3, Prioritization @ 6, Project Management @ 6

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. The Interpretability team’s mission is to reverse engineer how trained models work: mechanistic interpretability treats neural networks as systems to be reverse engineered, aiming to discover how their parameters map to meaningful algorithms. The team has produced multiple publications and methods for decomposing models into interpretable components.

About the role

As a manager on the Interpretability team, you will support a team of expert researchers and engineers working to understand, at a mechanistic level, how modern large language models operate internally. You will partner closely with an individual contributor research lead to translate research ideas into tangible goals, oversee execution, manage team performance and careers, facilitate cross-team relationships, and drive hiring.

If you prefer to make individual technical contributions rather than focus on management, Anthropic suggests applying to its Research Scientist or Research Engineer roles.

Responsibilities

Partner with a research lead on direction, project planning and execution, hiring, and people development
Set and maintain a high bar for execution speed and quality; identify process improvements
Coach and support team members to increase impact and develop careers
Drive the team's recruiting efforts, including hiring planning, process improvements, and sourcing and closing
Help identify and support opportunities for collaboration with other teams across Anthropic
Communicate team updates and results to other teams and leadership
Maintain a deep understanding of the team's technical work and its implications for AI safety

Requirements / Qualifications (You may be a good fit if you)

Are an experienced manager (minimum 2–5 years) with a track record of effectively leading highly technical research and/or engineering teams
Have a background in machine learning, AI, or a related technical field
Actively enjoy people management and are experienced with coaching and mentorship, performance evaluation, career development, and hiring for technical roles
Have strong project management skills, including prioritization and cross-functional coordination and collaboration
Have managed technical teams through periods of ambiguity and change
Are a quick learner, capable of understanding and contributing to discussions on complex technical topics, and motivated to learn about the research
Are a strong communicator both in speaking and in writing
Believe that advanced AI systems could have a transformative effect on the world and are passionate about helping make sure that transformation goes well

Strong candidates may also have

Experience scaling engineering infrastructure
Experience working on open-ended, exploratory research agendas aimed at foundational insights
Some familiarity with Anthropic’s work and mechanistic interpretability

Role specific location / office policy

This role is expected to be in Anthropic’s San Francisco office for 3 days a week (hybrid policy).
The organization expects staff to be in one of their offices at least 25% of the time; some roles may require more time in office.

Compensation

Expected base annual salary: $340,000 - $425,000 USD
Total compensation package includes equity, benefits, and may include incentive compensation

Logistics

Education requirements: At least a Bachelor's degree in a related field or equivalent experience
Visa sponsorship: Anthropic does sponsor visas where possible and retains an immigration lawyer to assist when they make an offer
Applicants are encouraged to apply even if they do not meet every single qualification

How to apply / other information

The posting asks applicants to provide a resume or LinkedIn profile, plus answers to several role-specific application questions (e.g., experience leading open-ended research, line management details, achievements as a manager, visa questions, and preferences about in-office time).
The posting includes links to representative Interpretability publications and resources for learning about the team and their research.