Used Tools & Technologies
Not specified
Required Skills & Competences
Kubernetes @ 3, Machine Learning @ 3, Communication @ 3, NLP @ 3, LLM @ 3
Details
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial for users and society. The team is a fast-growing group of researchers, engineers, policy experts, and leaders building beneficial AI systems.
About the Role
Build and run elegant and thorough machine learning experiments to understand and steer the behavior of powerful AI systems. Focus on making AI helpful, honest, and harmless, with particular attention to challenges related to human-level capabilities. Work as both a scientist and an engineer on exploratory experimental research in AI safety, concentrating on risks from powerful future systems (ASL-3 or ASL-4), often in collaboration with the Interpretability, Fine-Tuning, and Frontier Red Team teams.
Research Areas
- AI Control: Developing methods to keep advanced AI safe and harmless in unfamiliar or adversarial scenarios.
- Alignment Stress-testing: Developing model organisms of misalignment to understand alignment failures empirically.
Representative Projects
- Test robustness of safety techniques by training language models to subvert them.
- Run multi-agent reinforcement learning experiments (e.g., AI Debate).
- Build tools to evaluate effectiveness of LLM-generated jailbreaks.
- Write scripts and prompts for evaluation questions assessing model reasoning in safety contexts.
- Contribute ideas, visuals, and writing to research outputs.
- Run experiments supporting key AI safety efforts like the Responsible Scaling Policy.
Candidate Profile
Required
- Significant software, ML, or research engineering experience.
- Experience contributing to empirical AI research.
- Familiarity with technical AI safety research.
- Enjoy fast-moving collaborative projects.
- Willingness to take on tasks beyond the strict job description.
- Care about AI impacts.
Strong Candidates May Also Have
- Authored research papers in ML, NLP, or AI safety.
- Experience with LLMs.
- Experience with reinforcement learning.
- Experience managing Kubernetes clusters and complex shared codebases.
Not Required
- 100% of listed skills.
- Formal certifications or degrees.
Salary
Annual salary range: £250,000 to £270,000 (GBP)
Logistics
- Requires at least a Bachelor's degree or equivalent experience.
- Hybrid location policy: staff are expected to be in the office at least 25% of the time.
- Based in London with occasional travel to San Francisco.
- Visa sponsorship is available; the company makes reasonable efforts to secure a visa for hired candidates.
How We're Different
- Focus on large-scale impactful AI research.
- Emphasize empirical science with collaboration and communication.
- Research informed by past work including GPT-3, Interpretability, Scaling Laws, and AI Safety.
Benefits
- Competitive compensation and benefits.
- Optional equity donation matching.
- Generous vacation and parental leave.
- Flexible working hours.
- Collaborative and lovely office space.
Additional
- Guidance is provided on candidates’ use of AI during the application process.