Used Tools & Technologies
Not specified
Required Skills & Competences
- Algorithms @ 3
- Machine Learning @ 3
- Hiring @ 3
- Leadership @ 3
- People Management @ 3
- Communication @ 6
- Prioritization @ 6
- Project Management @ 6
Details
Anthropic’s Interpretability team researches mechanistic interpretability of large language models to discover how neural network parameters map to meaningful algorithms and to build a scientific foundation for making models reliable, interpretable, and steerable. The team focuses on reverse-engineering model computation ("circuits"), resolving superposition, decomposing models into interpretable components, and applying these methods to production models (e.g., Claude series).
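To ground the research description: one line of the team's published work (e.g., "Towards Monosemanticity") resolves superposition by training sparse autoencoders to decompose model activations into interpretable features. The sketch below is a minimal, illustrative PyTorch version of that idea; the class name, dimensions, and hyperparameters are hypothetical assumptions, not Anthropic's actual implementation.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Toy sparse autoencoder: decomposes activations into an
    overcomplete set of candidate interpretable features."""
    def __init__(self, d_model: int, d_features: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)  # activation -> feature coefficients
        self.decoder = nn.Linear(d_features, d_model)  # feature coefficients -> reconstruction

    def forward(self, x: torch.Tensor):
        feats = torch.relu(self.encoder(x))  # non-negative feature activations
        return self.decoder(feats), feats

# Hypothetical training loop: reconstruct activations while an L1 penalty
# pushes most feature activations to zero, so each feature that does fire
# tends to track a narrow, human-interpretable concept.
d_model, d_features = 512, 4096              # assumed sizes, for illustration only
sae = SparseAutoencoder(d_model, d_features)
opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
acts = torch.randn(8192, d_model)            # stand-in for residual-stream activations
for _ in range(100):
    recon, feats = sae(acts)
    loss = ((recon - acts) ** 2).mean() + 1e-3 * feats.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```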
Responsibilities
- Partner with a research lead on direction, project planning and execution, hiring, and people development
- Set and maintain a high bar for execution speed and quality; identify process improvements to help the team operate effectively
- Coach and support team members to increase impact and develop their careers
- Drive the team's recruiting efforts, including hiring planning, process improvements, sourcing, and closing
- Help identify and support opportunities for collaboration with other teams across Anthropic
- Communicate team updates and results to other teams and leadership
- Maintain a deep understanding of the team's technical work and its implications for AI safety
Requirements
- Experienced manager (roughly 2–5 years minimum) with a track record of effectively leading highly technical research and/or engineering teams
- Background in machine learning, AI, or a related technical field
- Active enjoyment of people management and experience with coaching, mentorship, performance evaluation, career development, and hiring for technical roles
- Strong project management skills, including prioritization and cross-functional coordination
- Experience managing technical teams through periods of ambiguity and change
- Quick learner, capable of understanding and contributing to discussions on complex technical topics
- Strong written and verbal communication skills
- Commitment to AI safety and the mission of building steerable, trustworthy AI systems
Strong candidates may also have
- Experience scaling engineering infrastructure
- Experience working on open-ended, exploratory research agendas aimed at foundational insights
- Familiarity with mechanistic interpretability and related research (e.g., circuit-based interpretability, feature discovery)
Role-specific location policy
- Staff in this role are expected to be in the San Francisco office roughly 3 days a week (hybrid). Company-wide, Anthropic expects staff to be in one of its offices at least ~25% of the time; some roles may require more time in-office.
Logistics & Education
- Minimum: Bachelor's degree in a related field or equivalent experience
- Visa sponsorship: Anthropic sponsors visas in many cases and, if an offer is made, will make reasonable efforts to secure one
Benefits
- Competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and an office in San Francisco
Additional notes
- The team emphasizes collaborative, large-scale research efforts and values strong communication and impact-driven work. Familiarity with the team's prior work (circuits, monosemanticity, activation atlases, etc.) is helpful but not strictly required.