Research Engineer, Agents

USD 315,000-425,000 per year
MIDDLE
✅ Hybrid

Used Tools & Technologies

Not specified

Required Skills & Competences

Communication @ 3, LLM @ 3

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. The Agents team is focused on making Claude an even more effective agent by improving planning, reliable execution over long horizons, scaled tool use, memory, and inter-agent coordination. The role combines research and engineering work across finetuning, agent infrastructure, agent design best practices, automated evaluation, and product collaboration. Candidates are asked to share a project built on large language models (LLMs) demonstrating work on complex tasks (examples: agent design, prompting experiments, benchmarks, synthetic data generation, finetuning, or LLM applications).

Responsibilities

  • Finetune new capabilities into Claude to maximize performance or ease of use on agentic tasks
  • Ideate, develop, and compare performance of different tools for agents (e.g., memory, context compression, communication architectures)
  • Systematically discover and test prompt engineering best practices for agents
  • Develop automated techniques for designing and evaluating agentic systems
  • Assist with automated evaluation of Claude models and prompts across the training and product lifecycle
  • Work with product teams to solve challenges applying agents to products
  • Create and optimize data mixes for model training
  • Create and maintain infrastructure required for efficient prompt iteration and testing

Requirements

  • 7+ years of ML and software engineering experience
  • At least a high-level familiarity with the architecture and operation of large language models
  • Extensive prior experience exploring and testing language model behavior
  • Experience prompting and/or building products with language models
  • Strong communication skills and interest in collaborative research
  • Passion for making powerful technology safe and societally beneficial
  • Habit of staying current with emerging research and industry trends
  • Enjoyment of pair programming

Strong candidates may also have experience with:

  • Developing complex agentic systems using LLMs
  • Large-scale reinforcement learning on language models
  • Multi-agent systems

Representative projects

  • Implementing and testing a novel retrieval, tool use, sub-agent, or memory architecture for Claude
  • Finetuning Claude to maximize performance using a specific set of agent tools (e.g., read-write memory, inter-agent communication)
  • Building prompting and model orchestration for a production application backed by an LLM
  • Building and testing an automatic prompt optimizer or automatic LLM-driven evaluation system
  • Building a scaled model evaluation framework driven by model-based evaluation techniques

Logistics

  • Education: Bachelor's degree in a related field or equivalent experience required
  • Location-based hybrid policy: staff expected to be in office at least 25% of the time (some roles may require more)
  • Visa sponsorship: Anthropic sponsors visas and will make reasonable efforts to secure one if an offer is made
  • Applicants are encouraged to apply even if they do not meet every qualification

How we're different

Anthropic focuses on large-scale, high-impact AI research, working as a cohesive team on a few large research efforts. The company values empirical science, collaboration, frequent research discussions, and strong communication.

Compensation & Benefits

  • Annual salary range: USD 315,000-425,000 (as listed at the top of this posting)
  • Competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible hours, and office space for collaboration

Application guidance

  • Applicants should share a project built on LLMs that demonstrates their ability to get LLMs to perform complex tasks. They should indicate their personal contributions and may optionally describe their process and any roadblocks encountered.