Research Manager, Interpretability

at Anthropic

📍 San Francisco, United States

USD 340,000-425,000 per year

SENIOR

✅ Hybrid

Required Skills & Competences

Algorithms @ 4
Machine Learning @ 4
Hiring @ 4
Leadership @ 4
People Management @ 4
Communication @ 7
Prioritization @ 7
Project Management @ 7

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial for users and society. The Interpretability team focuses on mechanistic interpretability—reverse engineering how neural network parameters map to meaningful algorithms—to build a solid scientific foundation for understanding and making neural networks safe.

Responsibilities

Partner with a research lead on direction, project planning and execution, hiring, and people development
Set and maintain high standards for execution speed and quality, improving processes for team efficiency
Coach and support team members for greater impact and career development
Drive recruiting efforts including planning, process improvements, sourcing, and closing
Identify and support collaboration opportunities across Anthropic
Communicate updates and results to other teams and leadership
Maintain deep understanding of the team's technical work and its AI safety implications

Requirements

Minimum of 2-5 years of experience managing highly technical research and/or engineering teams
Background in machine learning, AI, or related technical field
Enjoyment of people management, with experience in coaching, mentorship, performance evaluation, career development, and hiring
Strong project management skills, including prioritization and cross-functional collaboration
Experience managing teams through ambiguity and change
Quick learner with motivation to understand complex technical topics and research
Strong verbal and written communication skills
Passionate about ensuring advanced AI systems have a positive transformative effect on the world

Strong Candidates May Also Have

Experience scaling engineering infrastructure
Experience with open-ended, exploratory foundational research agendas
Familiarity with mechanistic interpretability research

Location and Office Policy

This role is expected to be onsite at the San Francisco office 3 days a week
The general hybrid work model requires a minimum of 25% office presence

Compensation and Benefits

Annual salary range: $340,000 - $425,000 USD
Competitive compensation and benefits
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours
Supportive and collaborative office environment in San Francisco

Additional Information

Requires at least a Bachelor's degree or equivalent experience
Visa sponsorship available; subject to role and candidate eligibility
Candidates are encouraged to apply even if they do not meet every listed qualification

Anthropic values collaborative, high-impact AI research aimed at producing trustworthy AI systems, with an emphasis on communication and team cohesion.