Used Tools & Technologies
Machine Learning
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose;
- 3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology;
- 7-9: expert. You can teach others and you know the pitfalls and tricks;
- 10: exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Communication @ 6
Prioritization @ 6
Project Management @ 6
AI @ 3
Details
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial. The Human Data Platform team builds systems to collect data that improves models — including infrastructure to simulate real-world tasks, vendor interfaces, and pipelines to gather high-quality data at scale. You will work with engineering, research, data ops, and external vendors to prioritize and build tooling that scales as Claude's real-world usage evolves.
Responsibilities
- Own product direction for human data tooling, prioritizing labeling interfaces, infrastructure investments, data quality, and operational visibility
- Partner with engineering to scope and ship quickly in a fast-moving prototyping environment
- Develop deep understanding of research and training approaches to identify high-leverage tooling investments
- Identify patterns across one-off requests and push toward reusable infrastructure
- Sit in on crowdworker and vendor sessions to systematically understand pain points
- Define and track outcome-based KPIs (time-to-launch for new data collection projects, end-to-end data quality scores, measurable impact on model evaluation)
Requirements / Qualifications
- Experience shipping products where you must deeply understand technical constraints
- Experience working directly with research teams (ideally in AI/ML contexts)
- Comfortable talking to crowdworkers about workflow and to research teams about data quality methodology
- Interest in how humans interact with AI systems and designing experiences that elicit high-quality data
- Minimum: Bachelor's degree or equivalent combination of education, training, and/or experience
- Minimum years of experience: correlates with the internal job level requirements for the position
Strong Candidates May Also Have
- Experience building data collection tools, annotation platforms, or human-in-the-loop pipelines
- Experience working with researchers as internal users/customers
- Good instincts for intuitive user experiences in complex UI and annotation workflows
- Strong project management skills: prioritization and cross-org communication
Logistics
- Locations: San Francisco, CA and New York City, NY (United States)
- Location-based hybrid policy: staff are expected to be in an office at least 25% of the time
- Visa sponsorship: Anthropic states that it sponsors visas and retains an immigration lawyer to assist where possible
Compensation
- Annual salary range: $305,000 - $385,000 USD
Benefits
- Competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and office space for collaboration
How We're Different
- Collaborative, research-driven environment focused on large-scale, high-impact AI research directions (examples: interpretability, scaling laws, learning from human preferences)