Research Engineer / Scientist, Tool Use Safety

USD 315,000-425,000 per year
Seniority: Middle
Hybrid

Required Skills & Competences

Security (3) · Python (3) · Machine Learning (6) · Communication (3) · Mathematics (6) · LLM (5) · Claude Code (3)

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial for users and society. The Tool Use Team within Research focuses on foundational problems for tool use and agentic applications, including tool use safety (e.g., prompt injection robustness), tool call accuracy, long-horizon and complex tool-use workflows, large-scale and dynamic tools, and tool-use efficiency. The team’s work underpins internal products (Claude for Chrome, Computer Use, Claude Code, Search) and customer-facing agentic applications. You will collaborate with researchers and engineers to advance safe tool use in Claude and own the full research lifecycle — from identifying limitations to implementing solutions that ship in production models. Note: interviews for this role are conducted in Python.

Responsibilities

  • Design and implement novel and scalable reinforcement learning methodologies that advance tool use safety
  • Define and pursue research agendas that push the boundaries of what's possible
  • Build rigorous, realistic evaluations that capture the complexity of real-world tool use safety challenges
  • Ship research advances that directly impact and protect millions of users
  • Collaborate with safety research (Safeguards, Alignment Science), capabilities research, and product teams to drive breakthroughs in safety and ship them into production
  • Design, implement, and debug code across research and production ML stacks
  • Contribute to a collaborative research culture via pair programming, technical discussions, and team problem-solving

Requirements

  • Passion for AI safety and Anthropic’s mission
  • Strong machine learning research or applied-research experience, or a strong quantitative background (physics, mathematics, or quantitative finance research)
  • Solid software engineering skills and the ability to write clean, reliable code
  • Clear communication of complex ideas to diverse audiences
  • Hunger to learn and grow regardless of years of experience
  • At least a Bachelor's degree in a related field, or equivalent experience

Strong candidates may also have

  • Experience with tool use / agentic safety, trust & safety, or security
  • Experience with reinforcement learning techniques and environments
  • Experience with language model training, fine-tuning, or evaluation
  • Experience building AI agents or autonomous systems
  • Published influential work in relevant ML areas, especially LLM safety and alignment
  • Deep expertise in a specialized area (RL, security, or mathematical foundations)
  • Experience shipping features or working closely with product teams
  • Enthusiasm for pair programming and collaborative research

Logistics

  • Locations: San Francisco, CA and New York City, NY
  • Location-based hybrid policy: staff are expected to be in an office at least 25% of the time (some roles may require more)
  • Visa sponsorship: Anthropic does sponsor visas and retains an immigration lawyer, though not every role/candidate can be successfully sponsored
  • Anthropic provides guidance on candidates' AI usage during the application process, along with other company policies

How we're different

Anthropic values big-science, high-impact AI research done as a cohesive team on a few large-scale efforts. The organization emphasizes empirical approaches, collaborative research discussions, and communication skills. Recent research directions include topics such as interpretability, scaling laws, learning from human preferences, and concrete AI safety problems.

Benefits

Anthropic offers competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and office spaces for collaboration. The company encourages applicants from diverse backgrounds and welcomes candidates who may not meet every listed qualification.