Engineering Manager, Safeguards Review Tooling

at Anthropic

📍 San Francisco, United States

USD 405,000-485,000 per year

MIDDLE

✅ Hybrid

✅ Visa Sponsorship

Used Tools & Technologies

Not specified

Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level. The levels mean:

1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose.

3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology.

7-9: expert. You can teach others and you know the pitfalls and tricks.

10: exceptional knowledge. Comprehensive understanding of and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.

Data Science @ 3, Hiring @ 3, Communication @ 3, Fraud @ 3, Engineering Management @ 3, LLM @ 3, Compliance @ 3, AI @ 3

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. The Safeguards team ensures models and products are developed and deployed safely. The Review Tooling team builds systems that humans — and increasingly Claude — use to investigate potential harms and take enforcement actions across Anthropic's first-party products and third-party cloud platforms.

Role overview

This is a foundational engineering manager role: you own the tools that safety investigators rely on to understand platform activity and take enforcement actions. The platform includes analytics capabilities, privacy-preserving data access primitives, and a sandbox environment for rapid development of review interfaces and workflows. As usage and model capabilities grow, you'll scale review through automation and by integrating Claude where appropriate, keeping humans in the loop where judgment matters.

Responsibilities

Lead, grow, and develop a team of engineers building investigation, review, and enforcement tooling for first-party and third-party platforms
Define vision and roadmap for the review tooling platform, including analytics, privacy-compatible data access, and a sandbox for new review interfaces
Drive strategy for scaling review through automation and enabling reviewers to use Claude effectively (Claude-assisted and Claude-driven workflows)
Partner with policy, operations, legal, privacy, and data science stakeholders to translate enforcement and investigation needs into reliable, well-designed systems
Ensure tooling evolves alongside privacy primitives and data retention commitments
Create clarity for the team and stakeholders in an ambiguous and evolving environment
Hire, coach, and maintain a high-performing, inclusive engineering team
Contribute to engineering-wide initiatives as a member of Anthropic's engineering management community

Requirements

Minimum qualifications:

Experience managing software engineering teams (hiring, coaching, developing engineers)
Technical background in full-stack or platform engineering, able to engage in architecture and design discussions
Experience shipping internal tools or platforms for demanding operational users, with measurable workflow improvements
Experience working cross-functionally with non-engineering partners (operations, policy, legal)
Excellent communication skills and ability to explain technical tradeoffs to non-technical stakeholders
Motivation to work on societal impacts of AI and make powerful systems safer

Preferred qualifications:

4+ years of management experience, 10+ years of industry software engineering experience
Experience building trust & safety, integrity, fraud, or abuse-prevention tooling or systems supporting human review at scale
Experience designing systems under strict privacy, compliance, or data governance constraints (e.g., zero data retention environments)
Experience integrating LLMs or agentic systems into operational workflows and building human-in-the-loop automation
Experience building developer platforms or extensible tooling frameworks
Experience supporting enforcement or moderation systems across multiple product surfaces, including enterprise or cloud platform contexts

Compensation

Annual Salary: $405,000 - $485,000 USD

Logistics

Minimum education: Bachelor’s degree or equivalent combination of education/training/experience
Location: San Francisco, CA (location-based hybrid policy: staff expected to be in an office at least 25% of the time)
Visa sponsorship: Anthropic states they sponsor visas and retain an immigration lawyer to assist (subject to role/candidate)

Benefits

Anthropic offers competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and an office space for collaboration.

Additional context

The role involves close collaboration with policy, operations, data science, and legal teams.
The team will build analytics, privacy-preserving primitives, sandboxes, and automated workflows that may integrate Claude and other LLM/agentic systems.
Anthropic emphasizes inclusive hiring and encourages candidates to apply even if they do not meet every qualification listed.