Safeguards Enforcement Analyst, Safety Evaluations

USD 230,000-270,000 per year
MIDDLE
✅ Hybrid
✅ Visa Sponsorship

Required Skills & Competences

  • SQL @ 5
  • Scoping @ 3
  • Communication @ 6
  • Prioritization @ 6
  • Claude Code @ 3
  • AI @ 3

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. The Safeguards team enforces policies, protects users, and ensures the platform is not misused. This role focuses on Safety Evaluations: running and monitoring evaluations to confirm that models meet safety and policy standards before and after launch, coordinating the creation of new evals, driving mitigations, and building the processes and documentation needed to scale evaluation work.

Responsibilities

  • Support model launch readiness by running evaluations, monitoring and interpreting results, and surfacing regressions or unexpected behavior changes to relevant stakeholders
  • Partner with policy and domain experts throughout the evaluation lifecycle, from identifying risks and scoping evaluation approaches to coordinating eval creation and keeping evals current with evolving policies, threat vectors, and model capabilities
  • Work with cross-functional stakeholders to manage evaluation outcomes, interpret results, and drive mitigations where needed
  • Design processes and eval paradigms to keep evaluations high-signal and insightful as models improve
  • Build processes and frameworks for creating product-specific evaluations as Anthropic’s product surface expands
  • Help design and scope tooling improvements to support evolving eval needs and expand self-serve eval creation for non-technical users
  • Write and maintain rigorous documentation for evaluation creation, execution, and interpretation as eval tooling and processes are built out

Requirements / Qualifications

  • Experience in trust and safety, content operations, policy enforcement, or a related operational role at a technology company
  • Comfortable working in ambiguous, fast-moving environments and figuring out paths forward with incomplete information
  • Experience building processes, workflows, or programs from scratch (zero-to-one work)
  • Strong program management instincts: creating structure around complex, multi-stakeholder efforts and tracking timelines, dependencies, and deliverables
  • Eagerness to expand your technical toolkit and adopt internal tools and AI-assisted workflows (e.g., Claude Code)
  • Ability to manage multiple concurrent workstreams across different domain areas with strong prioritization and context-switching
  • Strong written and cross-functional communication skills
  • At least a Bachelor’s degree in a related field, or equivalent experience

Strong candidates may also have

  • Experience operating under tight, high-stakes timelines (product launches, incident response, regulatory deadlines)
  • Experience coordinating across engineering, policy, and product teams to translate findings into concrete action
  • Experience building and maintaining SOPs, runbooks, and operational documentation in fast-changing environments
  • Proficiency with data tools (SQL, dashboards, spreadsheets) sufficient to maintain and improve workflows
  • Comfort working with sensitive content areas as part of eval creation or enforcement review responsibilities

Compensation

  • Annual Salary: $230,000 - $270,000 USD

Logistics

  • Location-based hybrid policy: staff are expected to be in one of Anthropic’s offices at least 25% of the time; some roles may require more time in office
  • Remote-friendly (travel required)
  • Visa sponsorship: Anthropic sponsors visas and retains an immigration lawyer, though sponsorship availability may vary by role and candidate

Benefits

  • Competitive compensation and benefits
  • Optional equity donation matching
  • Generous vacation and parental leave
  • Flexible working hours and pleasant office spaces

How we work

  • Collaborative, research-driven approach focused on large-scale, high-impact research efforts
  • Frequent research discussions and an emphasis on communication and cross-functional collaboration