Staff Machine Learning Engineer, Virtual Collaborator
Required Skills & Competences
Python, Machine Learning, Communication, Slack, API
About Anthropic
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
Role description
We are looking for a Machine Learning Engineer to help train Claude specifically for virtual collaborator workflows. While Claude excels at general tasks, many knowledge work problems require targeted training on real organizational data and workflows. This role involves designing and implementing reinforcement learning environments and data pipelines to make Claude an effective virtual collaborator across productivity, organizational navigation, and vertical domains.
Responsibilities
- Design and implement reinforcement learning pipelines specifically targeted at virtual collaborator use cases (productivity, organizational navigation, vertical domains)
- Build and scale the data creation platform for generating high-quality, open-ended tasks with domain experts and crowdworkers
- Integrate real organizational data to create authentic training environments
- Develop robust rubric-based evaluation systems that maintain quality while avoiding reward hacking
- Train Claude on advanced document manipulation, including understanding, enhancing, and co-creating
- Partner directly with product teams to ensure training aligns with shipped features
 
Requirements
- Very experienced Python programmer who can quickly produce reliable, high-quality code
- Strong machine learning experience
- Comfortable working at the intersection of research and product, balancing research rigor with shipping deadlines
- Experience collaborating across multiple teams (data operations, model training, product)
- Ability to context-switch between research problems and product engineering tasks
- Interest in making AI helpful for everyday enterprise workflows
 
Strong candidates may also have
- Experience building human-in-the-loop training systems or crowdsourcing platforms
- Experience working with enterprise tools and APIs (Google Workspace, Microsoft Office, Slack, etc.)
- Experience developing evaluation frameworks for open-ended tasks
- Domain expertise in finance, legal, or healthcare workflows
- Experience creating scalable data pipelines with quality control mechanisms
- Experience with reward modeling and preventing reward hacking in RL systems
- Experience translating product requirements into technical training objectives
 
Compensation
Annual Salary: $340,000 - $560,000 USD
Our total compensation package for full-time employees includes equity, benefits, and may include incentive compensation.
Logistics & Additional Information
- Education requirements: At least a Bachelor's degree in a related field or equivalent experience
- Location-based hybrid policy: We expect staff to be in one of our offices at least 25% of the time (some roles may require more time in-office)
- Visa sponsorship: We do sponsor visas where possible and retain an immigration lawyer to assist
- Deadline to apply: None (applications reviewed on a rolling basis)
 
How we're different
Anthropic focuses on large-scale research efforts as a cohesive team and values impact and collaboration. We view AI research as an empirical science and host frequent research discussions. We value communication skills and diverse perspectives.
How to apply
Apply through the application form on the job posting. The form asks for either a Resume/CV or a LinkedIn profile and includes optional questions about relocation, visa needs, and other preferences. Read the candidate AI usage policy before applying (link in the job posting).