Research Engineer, Agents

USD 315,000-425,000 per year
SENIOR
Hybrid

Used Tools & Technologies

Not specified

Required Skills & Competences

  • Machine Learning @ 7
  • Communication @ 4
  • LLM @ 3

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems aimed at being safe and beneficial for users and society. The team consists of researchers, engineers, policy experts, and business leaders working to build beneficial AI systems.

Responsibilities

  • Finetune new capabilities into Claude that maximize its performance or ease of use on agentic tasks
  • Develop and compare performance of different tools for agents (e.g., memory, context compression, communication architectures)
  • Systematically discover and test prompt engineering best practices for agents
  • Develop automated techniques for designing and evaluating agentic systems
  • Assist with automated evaluation of Claude models and prompts across the training and product lifecycle
  • Collaborate with the product organization to solve challenges applying agents to products
  • Create and optimize data mixes for model training
  • Maintain infrastructure for efficient prompt iteration and testing

Requirements

  • 7+ years of machine learning and software engineering experience
  • High-level familiarity with large language model (LLM) architecture and operation
  • Extensive experience exploring and testing language model behavior
  • Experience with prompting and/or building products using language models
  • Strong communication skills and an interest in collaborating with other researchers on difficult problems
  • Passion for making powerful technology safe and socially beneficial
  • Stays up to date on emerging research and industry trends
  • Enjoys pair programming

Preferred Experience

  • Developing complex agentic systems using LLMs
  • Large-scale reinforcement learning on language models
  • Multi-agent systems

Representative Projects

  • Implementing and testing novel architectures for retrieval, tool use, sub-agents, or memory for Claude
  • Finetuning Claude to maximize performance using agent tools like read-write memory or inter-agent communication
  • Building prompting and orchestration for production LLM applications
  • Creating automatic prompt optimizers or LLM-driven evaluation systems
  • Building scaled model evaluation frameworks using model-based evaluation techniques

Logistics

  • Education: Bachelor's degree in a related field or equivalent experience
  • Hybrid work policy: Expected to be in office at least 25% of the time; some roles may require more
  • Visa sponsorship: available; the company will make a reasonable effort to sponsor visas

Benefits

  • Competitive compensation and benefits
  • Optional equity donation matching
  • Generous vacation and parental leave
  • Flexible working hours
  • Collaborative office space in San Francisco

Company Culture

  • Focus on high-impact, big-science AI research
  • Collaborative and cohesive team
  • Frequent research discussions, with communication skills highly valued
  • Research directions aligned with major AI advancements and safety

AI Usage Guidance

  • The company provides its policy on AI usage during the application process.