Required Skills & Competences
- Go (5), Python (5), Java (5), Rust (5)
- GitHub (3), Algorithms (3), Distributed Systems (3), Machine Learning (3), Communication (3), Experimentation (3), LLM (3), PyTorch (3)
Details
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. The Interpretability team focuses on mechanistic interpretability (discovering how neural network parameters map to meaningful algorithms) and builds tools and infrastructure to reverse-engineer how trained models work. The team's work spans empirical research, tooling, large-scale data pipelines, and collaborations across Anthropic to improve model safety.
Responsibilities
- Implement and analyze research experiments, both quickly in toy scenarios and at scale in large models
- Set up and optimize research workflows to run efficiently and reliably at large scale
- Build tools and abstractions to support a rapid pace of research experimentation
- Develop and improve tools and infrastructure to support other teams in using Interpretability’s work to improve model safety
Requirements
- 5–10+ years of experience building software
- Highly proficient in at least one programming language (e.g., Python, Rust, Go, Java) and productive working in Python
- Some experience contributing to empirical AI research projects
- Strong ability to prioritize and direct effort toward the most impactful work; comfortable operating with ambiguity and questioning assumptions
- A preference for collaborative, fast-moving projects
- Interest in learning machine learning research and in collaborating closely with researchers
- Concern for the societal impacts and ethics of the work
- Education: at least a Bachelor’s degree in a related field or equivalent experience
Strong candidates may also have experience with
- Designing a codebase so that others can quickly write experiments, launch them, and analyze the results without hitting bugs
- Optimizing performance of large-scale distributed systems
- Collaborating closely with researchers
- Language modeling with transformers
- GPUs or PyTorch
Representative projects / examples of work
- Building Garcon: a tool that allows researchers to easily access LLM internals from a Jupyter notebook (a minimal sketch of this kind of access appears after this list)
- Setting up and optimizing a pipeline to collect petabytes of transformer activations and shuffle them (a toy sketch of the shuffle step appears after this list)
- Profiling and optimizing ML training, including parallelizing to many GPUs
- Making launching ML experiments and analyzing results fast and easy
- Creating interactive visualizations of attention between tokens in a language model (a data-extraction sketch appears after this list)
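
Garcon itself is internal to Anthropic, so the sketch below only illustrates the general mechanism such a tool wraps: registering PyTorch forward hooks on a small open-source model (GPT-2 via Hugging Face transformers, chosen here purely for illustration) to capture per-layer activations from a notebook.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model.eval()

captured = {}

def make_hook(name):
    def hook(module, args, output):
        # Each GPT-2 block returns a tuple; the hidden states come first.
        captured[name] = output[0].detach()
    return hook

# Register a hook on every transformer block.
handles = [
    block.register_forward_hook(make_hook(f"block_{i}"))
    for i, block in enumerate(model.transformer.h)
]

inputs = tokenizer("Interpretability is", return_tensors="pt")
with torch.no_grad():
    model(**inputs)

for handle in handles:
    handle.remove()

# captured now maps block names to (batch, seq, d_model) activation tensors.
print({name: tuple(t.shape) for name, t in captured.items()})
```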
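
None of the real pipeline's details are public, so this is only a minimal two-pass external-shuffle sketch under assumed conditions: activation rows arrive as a stream too large to shuffle in memory, get scattered into random shards on disk, and each shard is then shuffled independently. The shard count, row width, file names, and synthetic stream are all illustrative.

```python
import numpy as np

D_MODEL = 16        # activation width (tiny here; real pipelines are far wider)
NUM_SHARDS = 8
rng = np.random.default_rng(0)

def activation_stream(num_batches=100, batch_rows=256):
    """Stand-in for a stream of transformer activations."""
    for _ in range(num_batches):
        yield rng.standard_normal((batch_rows, D_MODEL), dtype=np.float32)

# Pass 1: scatter each incoming row into a random shard on disk.
shard_files = [open(f"shard_{i}.bin", "wb") for i in range(NUM_SHARDS)]
for batch in activation_stream():
    shard_ids = rng.integers(0, NUM_SHARDS, size=len(batch))
    for i in range(NUM_SHARDS):
        shard_files[i].write(batch[shard_ids == i].tobytes())
for f in shard_files:
    f.close()

# Pass 2: each shard now fits in memory, so shuffle it in place.
for i in range(NUM_SHARDS):
    rows = np.fromfile(f"shard_{i}.bin", dtype=np.float32).reshape(-1, D_MODEL)
    rng.shuffle(rows)
    rows.tofile(f"shard_{i}.bin")
```

The point of the two passes is that the full dataset never has to be shuffled in memory at once; only individual shards do.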
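
The interactive front end is the team's own, but the underlying data is just per-head attention weights. The sketch below pulls them out of GPT-2 with output_attentions=True and renders a static heatmap with matplotlib; both library choices are assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model.eval()

text = "Neural networks can be reverse-engineered"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_attentions=True)

# outputs.attentions holds one (batch, heads, seq, seq) tensor per layer.
attn = outputs.attentions[0][0, 0].numpy()   # layer 0, head 0
labels = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())

plt.imshow(attn, cmap="viridis")
plt.xticks(range(len(labels)), labels, rotation=90)
plt.yticks(range(len(labels)), labels)
plt.xlabel("Key token")
plt.ylabel("Query token")
plt.title("GPT-2 attention, layer 0, head 0")
plt.colorbar()
plt.tight_layout()
plt.show()
```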
Location & Office Policy
- This role is based in the San Francisco office; exceptional candidates may be considered for remote work on a case-by-case basis.
- Currently, staff are expected to be in one of Anthropic’s offices at least 25% of the time. Some roles may require more in-office time.
Compensation
- Annual base salary range: $315,000–$560,000 USD
- Total compensation for full-time employees includes equity, benefits, and may include incentive compensation
Logistics
- Visa sponsorship: Anthropic sponsors visas where feasible and retains immigration counsel to assist with the process
- Application materials: resume/CV or LinkedIn profile required; optional cover letter and links (GitHub, publications)
Why Anthropic / Culture
- Team values large-scale, high-impact empirical AI research and collaborative work across disciplines
- Emphasis on communication, safety, and societal/ethical implications of AI
How to apply
- Submit application via Anthropic’s careers page. Anthropic encourages applicants from diverse backgrounds and those who may not meet every listed qualification to apply.