Research Engineer, Frontier Red Team (RSP Evaluations)

at Anthropic

📍 San Francisco, United States
📍 Seattle, United States

USD 315,000-425,000 per year

MIDDLE

✅ Hybrid

SCRAPED

Used Tools & Technologies

Not specified

Required Skills & Competences ^?

Python @ 3 Debugging @ 3 API @ 3 LLM @ 3

Details

Anthropic is building reliable, interpretable, and steerable AI systems. This production engineering role focuses on building and operating automated evaluation infrastructure that systematically tests frontier AI models for dangerous capabilities. You will create scalable, reliable evaluation pipelines, work closely with domain experts (e.g., biosecurity, cybersecurity), and operate systems during high-stakes model launches.

Responsibilities

Build and maintain automated evaluation systems using distributed infrastructure
Create robust evaluation pipelines that can run thousands of model capability tests
Develop tools that allow domain experts to quickly deploy new safety evaluations
Ensure evaluation systems run reliably during high-stakes model launches
Write production-quality Python code for evaluation infrastructure that scales
Monitor and operate evaluation systems during critical assessment periods
Collaborate with domain experts to translate safety requirements into technical implementations
Participate in high-stakes execution during model launch windows and operate monitoring/alerting systems

Requirements

Strong Python programming skills; ability to write clean, maintainable, production-quality code
Comfortable working with LLMs programmatically (APIs, prompting, output processing)
Experience debugging complex systems and resolving issues under pressure
Experience or familiarity with distributed infrastructure and containerized environments for production workloads
Knowledge of systems optimization and building efficient, scalable solutions
Experience building or working with evaluation frameworks, testing infrastructure, or automated assessment systems is highly relevant
Comfortable operating production systems and building monitoring/observability for evaluation pipelines
Interest in AI safety and responsible model development
Ability to work independently on 1–3 month projects while collaborating effectively with domain experts
Education: At least a Bachelor's degree in a related field or equivalent experience

Strong candidates may also have

Background in physics, systems engineering, or other fields requiring critical thinking about experimental results
Familiarity with adversarial testing, red-teaming, or finding edge cases in complex systems
Understanding of LLM capabilities and limitations
Experience with containerization and production operations

Representative projects

Build automated red-teaming systems that generate and evaluate thousands of adversarial prompts
Create evaluation pipelines that systematically test model capabilities across multiple risk domains
Develop monitoring infrastructure that tracks evaluation results and detects capability jumps
Implement reliable containerized environments for running large-scale model assessments
Build tools that allow biosecurity experts to quickly create and deploy new biological risk evaluations
Create automated analysis systems that process evaluation results and generate capability reports

Logistics & Benefits

Annual salary range: $315,000 - $425,000 USD
Location-based hybrid policy: staff expected to be in one of Anthropic's offices at least ~25% of the time (some roles may require more)
Visa sponsorship: Anthropic does sponsor visas in many cases and retains immigration counsel to assist
Education requirement: Bachelor's degree or equivalent experience
Benefits: competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, collaborative office spaces

Other notes

Deadline: None (applications reviewed on a rolling basis)
Team mission: Ensure AI systems remain safe and beneficial by building reliable evaluation and red-teaming infrastructure