Used Tools & Technologies
Not specified
Required Skills & Competences
Go (5), Python (5), Java (5), GitHub (3), Algorithms (3), Distributed Systems (3), Machine Learning (3), Rust (5), Experimentation (3), LLM (3), PyTorch (3)
Details
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. The Interpretability team is focused on mechanistic interpretability: reverse-engineering how neural network parameters map to meaningful algorithms. The team builds tools and "microscopes" for neural networks, treats models as programs to reverse-engineer, and collaborates across Anthropic (e.g., Alignment Science, Pretraining). Representative publications and resources are linked in the original posting.
Responsibilities
- Implement and analyze research experiments, both quickly in toy scenarios and at scale in large models.
- Set up and optimize research workflows to run efficiently and reliably at large scale.
- Build tools and abstractions to support a rapid pace of research experimentation.
- Develop and improve tools and infrastructure to support other teams in using Interpretability’s work to improve model safety.
Requirements
- 5–10+ years of experience building software.
- Highly proficient in at least one programming language (e.g., Python, Rust, Go, Java) and productive working in Python.
- Some experience contributing to empirical AI research projects.
- Strong ability to prioritize and direct effort toward the most impactful work; comfortable operating with ambiguity and questioning assumptions.
- Prefer fast-moving collaborative projects to extensive solo efforts; want to learn more about machine learning research and collaborate closely with researchers.
- Care about the societal impacts and ethics of your work.
- Education: at least a Bachelor's degree in a related field or equivalent experience.
Strong candidates (preferred / nice-to-have)
- Designing a codebase that lets anyone quickly write experiments, launch them, and analyze the results with few bugs.
- Optimizing performance of large-scale distributed systems.
- Collaborating closely with researchers.
- Language modeling with transformers.
- Experience with GPUs or PyTorch.
Representative projects (examples of past or typical work)
- Building Garcon, a tool that allows researchers to easily access LLM internals from a Jupyter notebook (see the illustrative sketch after this list).
- Setting up and optimizing a pipeline to efficiently collect petabytes of transformer activations and shuffle them.
- Profiling and optimizing ML training, including parallelizing to many GPUs.
- Making it fast and easy to launch ML experiments and to manipulate and analyze the results.
- Creating an interactive visualization of attention between tokens in a language model.
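The bullets above are high-level, but the Garcon item hints at a common pattern in interpretability tooling: hooking into a model's forward pass so researchers can read out intermediate activations from a notebook. Below is a minimal, hypothetical sketch of that pattern using PyTorch forward hooks, with the open-source GPT-2 model as a stand-in; it is not Anthropic's actual tooling.

```python
# Minimal sketch (not Garcon itself): capture per-layer activations from a
# transformer using PyTorch forward hooks, with GPT-2 as a stand-in model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; the real tooling targets internal LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

captured = {}  # layer index -> activation tensor

def make_hook(layer_idx):
    def hook(module, inputs, output):
        # GPT-2 blocks return a tuple; the first element is the hidden
        # states of shape (batch, seq_len, hidden_dim).
        captured[layer_idx] = output[0].detach()
    return hook

# Attach a hook to every transformer block so a notebook user can inspect
# any layer's activations after a single forward pass.
handles = [block.register_forward_hook(make_hook(i))
           for i, block in enumerate(model.transformer.h)]

inputs = tokenizer("Interpretability is fun", return_tensors="pt")
with torch.no_grad():
    model(**inputs)

for h in handles:
    h.remove()

print({i: t.shape for i, t in captured.items()})
```

A production system built around this idea would stream the captured tensors to storage rather than holding them in memory, which is roughly what the petabyte-scale activation-collection bullet describes.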
Location & Office Policy
- This role is based in the San Francisco office; Anthropic is open to considering exceptional candidates for remote work on a case-by-case basis.
- The company expects staff to be in one of its offices at least 25% of the time (location-based hybrid policy), though some roles may require more time in offices.
Compensation
- Expected base annual salary: $315,000 - $560,000 USD.
- The total compensation package for full-time employees includes equity and benefits, and may include incentive compensation.
Logistics & Other Information
- Visa sponsorship: Anthropic sponsors visas but cannot guarantee sponsorship for every role or candidate; the company retains an immigration lawyer to assist once an offer has been made.
- Encouragement to apply even if you do not meet every qualification; strong emphasis on diversity and inclusion.
- Candidate guidance on using AI in the application process is provided via a linked policy.
How to Apply / Additional Application Details
- The posting includes an application form with fields for resume/CV, GitHub, publications, written prompts about fit and past work, earliest start date, visa questions, and other standard application items.
(Original posting contained multiple links to team resources, publications, and further reading.)