Research Engineer, Agents

at Anthropic

📍 New York City, United States
📍 San Francisco, United States
📍 Seattle, United States

USD 315,000-425,000 per year

MIDDLE

✅ Hybrid

Used Tools & Technologies

Not specified

Required Skills & Competences

Communication @ 3
LLM @ 3

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. The Agents team is focused on making Claude an even more effective agent by improving planning, reliable execution over long horizons, scaled tool use, memory, and inter-agent coordination. The role combines research and engineering work across finetuning, agent infrastructure, agent design best practices, automated evaluation, and product collaboration. Candidates are asked to share a project built on large language models (LLMs) demonstrating work on complex tasks (examples: agent design, prompting experiments, benchmarks, synthetic data generation, finetuning, or LLM applications).

Responsibilities

Finetune new capabilities into Claude to maximize performance or ease of use on agentic tasks
Ideate, develop, and compare performance of different tools for agents (e.g., memory, context compression, communication architectures)
Systematically discover and test prompt engineering best practices for agents
Develop automated techniques for designing and evaluating agentic systems
Assist with automated evaluation of Claude models and prompts across the training and product lifecycle
Work with product teams to solve challenges applying agents to products
Create and optimize data mixes for model training
Create and maintain infrastructure required for efficient prompt iteration and testing

Requirements

7+ years of ML and software engineering experience
At least a high-level familiarity with the architecture and operation of large language models
Extensive prior experience exploring and testing language model behavior
Experience prompting and/or building products with language models
Strong communication skills and interest in collaborative research
Passion for making powerful technology safe and societally beneficial
Habit of staying current with emerging research and industry trends
Enjoyment of pair programming

Strong candidates may also have experience with:

Developing complex agentic systems using LLMs
Large-scale reinforcement learning on language models
Multi-agent systems

Representative projects

Implementing and testing a novel retrieval, tool use, sub-agent, or memory architecture for Claude
Finetuning Claude to maximize performance using a specific set of agent tools (e.g., read-write memory, inter-agent communication)
Building prompting and model orchestration for a production application backed by an LLM
Building and testing an automatic prompt optimizer or automatic LLM-driven evaluation system
Building a scaled model evaluation framework driven by model-based evaluation techniques

Logistics

Education: Bachelor's degree in a related field or equivalent experience required
Location-based hybrid policy: staff are expected to be in the office at least 25% of the time (some roles may require more)
Visa sponsorship: Anthropic sponsors visas and, if an offer is made, will make reasonable efforts to obtain one
Applicants are encouraged to apply even if they do not meet every qualification

How we're different

Anthropic focuses on large-scale, high-impact AI research, working as a cohesive team on a few large research efforts. The company values empirical science, collaboration, frequent research discussions, and strong communication.

Compensation & Benefits

Annual salary range: USD 315,000-425,000
Competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible hours, and office space for collaboration

Application guidance

Applicants should share a project built on LLMs that demonstrates their ability to get LLMs to perform complex tasks; indicate personal contributions and optionally describe processes and roadblocks encountered.