Research Engineer, Virtual Collaborator

at Anthropic

📍 New York City, United States
📍 San Francisco, United States
📍 Seattle, United States

USD 315,000-560,000 per year

MIDDLE

✅ Hybrid

Required Skills & Competences

Python @ 3
Machine Learning @ 6
Communication @ 3
Slack @ 3
API @ 3

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial for users and society. The team includes researchers, engineers, policy experts, and business leaders working to build beneficial AI systems.

Responsibilities

Design and implement reinforcement learning pipelines focused on virtual collaborator use cases such as productivity, organizational navigation, and vertical domains.
Build and scale data creation platforms on which domain experts and crowdworkers generate high-quality, open-ended tasks.
Integrate real organizational data to create authentic training environments.
Develop robust rubric-based evaluation systems to maintain quality and avoid reward hacking.
Train Claude on advanced document manipulation, including understanding, enhancing, and co-creating documents.
Collaborate closely with product teams to align training with shipped features.

Requirements

Extensive Python programming experience with the ability to write reliable, high-quality code.
Strong machine learning research expertise, particularly in reinforcement learning and fine-tuning.
Ability to work at the intersection of research and product, solving real-world problems pragmatically.
Comfortable managing ambiguity and balancing research rigor with shipping deadlines.
Excellent collaboration skills across multiple teams (data operations, model training, product).
Ability to switch context between research and engineering tasks.
A commitment to making AI genuinely helpful for enterprise workflows.

Strongly Preferred Experience

Building human-in-the-loop training systems or crowdsourcing platforms.
Working with enterprise tools and APIs such as Google Workspace, Microsoft Office, and Slack.
Developing evaluation frameworks for open-ended tasks.
Domain expertise in finance, legal, or healthcare workflows.
Creating scalable data pipelines with quality control.
Expertise with reward modeling and preventing reward hacking in reinforcement learning systems.
Translating product requirements into technical training objectives.

Benefits and Logistics

Annual salary range: $315,000 - $560,000 USD.
A Bachelor's degree in a related field, or equivalent experience, is required.
Hybrid location policy requiring at least 25% in-office presence at offices in San Francisco, New York City, or Seattle.
Visa sponsorship is available with legal support.
Inclusive and diverse team culture encouraging applications from all qualified candidates.
Competitive compensation and benefits including optional equity donation matching, generous vacation and parental leave, flexible working hours, and collaborative office spaces.

About Anthropic

Anthropic is a public benefit corporation headquartered in San Francisco, focused on high-impact AI research through large-scale collaborative efforts. It values communication and an empirical approach to science, and its research roots include GPT-3, interpretability, multimodal neurons, scaling laws, AI safety, and learning from human preferences.