Research Engineer, Pre-Training

USD 340,000-425,000 per year
Seniority: Middle
Hybrid


Used Tools & Technologies

Not specified

Required Skills & Competences

  • Kubernetes: 2
  • Python: 3
  • ETL: 3
  • Algorithms: 3
  • Machine Learning: 3
  • Communication: 3
  • PyTorch: 3

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. This role sits on the Pre-training team and focuses on developing the next generation of large language models, working at the intersection of research and practical engineering to build safe, steerable, and trustworthy AI systems.

Responsibilities

  • Conduct research and implement solutions in areas such as model architecture, algorithms, data processing, and optimizer development
  • Independently lead small research projects while collaborating with team members on larger initiatives
  • Design, run, and analyze scientific experiments to advance understanding of large language models
  • Optimize and scale training infrastructure to improve efficiency and reliability
  • Develop and improve developer tooling to enhance team productivity
  • Contribute across the entire stack, from low-level optimizations to high-level model design

Requirements

  • Advanced degree (MS or PhD) in Computer Science, Machine Learning, or a related field
  • Strong software engineering skills with a proven track record of building complex systems
  • Expertise in Python
  • Experience with deep learning frameworks (PyTorch preferred)
  • Familiarity with large-scale machine learning, particularly for language models
  • Ability to balance research goals with practical engineering constraints
  • Strong problem-solving skills and results-oriented mindset
  • Excellent communication skills and ability to work collaboratively
  • Consideration of the societal impacts of AI work

Preferred Experience

  • Work on high-performance, large-scale ML systems
  • Familiarity with GPUs, Kubernetes, and OS internals
  • Experience with language modeling using transformer architectures
  • Knowledge of reinforcement learning techniques
  • Background in large-scale ETL and data processing pipelines

Sample Projects

  • Optimizing the throughput of novel attention mechanisms
  • Comparing compute efficiency of different Transformer variants
  • Preparing large-scale datasets for efficient model consumption
  • Scaling distributed training jobs to thousands of GPUs
  • Designing fault-tolerance strategies for training infrastructure
  • Creating interactive visualizations of model internals, such as attention patterns

Logistics & Office Policy

  • Remote-Friendly (travel required). Offices in San Francisco, CA; Seattle, WA; and New York City, NY
  • Location-based hybrid policy: staff expected to be in one of our offices at least ~25% of the time (some roles may require more office time)
  • Visa sponsorship: Anthropic sponsors visas and makes reasonable efforts to do so where possible

Compensation

Annual Salary: $340,000 - $425,000 USD

Total compensation for full-time employees may include equity, benefits, and incentive compensation.

Inclusion

Anthropic is committed to fostering a diverse and inclusive workplace and encourages applications from candidates of all backgrounds.