Staff Machine Learning Engineer, Virtual Collaborator

at Anthropic

📍 San Francisco, United States

USD 340,000-560,000 per year

SENIOR

✅ Hybrid

Required Skills & Competences

Python @ 4, Machine Learning @ 4, Communication @ 4, Slack @ 4, API @ 4

Details

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

Role description

We are looking for a Machine Learning Engineer to help train Claude specifically for virtual collaborator workflows. While Claude excels at general tasks, many knowledge work problems require targeted training on real organizational data and workflows. This role involves designing and implementing reinforcement learning environments and data pipelines to make Claude an effective virtual collaborator across productivity, organizational navigation, and vertical domains.

Responsibilities

Design and implement reinforcement learning pipelines specifically targeted at virtual collaborator use cases (productivity, organizational navigation, vertical domains)
Build and scale the data creation platform for generating high-quality, open-ended tasks with domain experts and crowdworkers
Integrate real organizational data to create authentic training environments
Develop robust rubric-based evaluation systems that maintain quality while avoiding reward hacking
Train Claude on advanced document manipulation, including understanding, enhancing, and co-creating
Partner directly with product teams to ensure training aligns with shipped features

Requirements

Very experienced Python programmer who can quickly produce reliable, high-quality code
Strong machine learning experience
Comfortable working at the intersection of research and product, balancing research rigor with shipping deadlines
Experience collaborating across multiple teams (data operations, model training, product)
Ability to context-switch between research problems and product engineering tasks
Interest in making AI helpful for everyday enterprise workflows

Strong candidates may also have

Experience building human-in-the-loop training systems or crowdsourcing platforms
Experience working with enterprise tools and APIs (Google Workspace, Microsoft Office, Slack, etc.)
Experience developing evaluation frameworks for open-ended tasks
Domain expertise in finance, legal, or healthcare workflows
Experience creating scalable data pipelines with quality control mechanisms
Experience with reward modeling and preventing reward hacking in RL systems
Experience translating product requirements into technical training objectives

Compensation

Annual Salary: $340,000 - $560,000 USD

Our total compensation package for full-time employees includes equity and benefits, and may include incentive compensation.

Logistics & Additional Information

Education requirements: At least a Bachelor's degree in a related field or equivalent experience
Location-based hybrid policy: We expect staff to be in one of our offices at least 25% of the time (some roles may require more time in-office)
Visa sponsorship: We do sponsor visas where possible and retain an immigration lawyer to assist
Deadline to apply: None (applications reviewed on a rolling basis)

How we're different

Anthropic focuses on large-scale research efforts as a cohesive team and values impact and collaboration. We view AI research as an empirical science and host frequent research discussions. We value communication skills and diverse perspectives.

How to apply

Complete the application form on the job posting. The listing asks for either a resume/CV or a LinkedIn profile and includes optional questions about relocation, visa needs, and other preferences. Read the candidate AI usage policy (linked in the job posting) before applying.