Research Engineer, Pre-Training

USD 340,000-425,000 per year
Seniority: Middle
Work mode: Hybrid

Required Skills & Competences

  • Kubernetes (level 2)
  • Python (level 3)
  • ETL (level 3)
  • Algorithms (level 3)
  • Machine Learning (level 3)
  • Communication (level 3)
  • PyTorch (level 3)

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We are seeking a Research Engineer to join our Pre-training team and help develop the next generation of large language models. The role sits at the intersection of research and engineering, contributing to safe, steerable, and trustworthy AI systems.

Responsibilities

  • Conduct research and implement solutions in areas such as model architecture, algorithms, data processing, and optimizer development.
  • Independently lead small research projects and collaborate with team members on larger initiatives.
  • Design, run, and analyze scientific experiments to advance understanding of large language models.
  • Optimize and scale training infrastructure to improve efficiency and reliability.
  • Develop and improve developer tooling to enhance team productivity.
  • Contribute across the entire stack, from low-level optimizations to high-level model design.

Requirements

  • Advanced degree (MS or PhD) in Computer Science, Machine Learning, or a related field.
  • Strong software engineering skills with a proven track record of building complex systems.
  • Expertise in Python and experience with deep learning frameworks (PyTorch preferred).
  • Familiarity with large-scale machine learning, particularly in the context of language models.
  • Ability to balance research goals with practical engineering constraints.
  • Strong problem-solving skills and results-oriented mindset.
  • Excellent communication skills and ability to work collaboratively.
  • Concern for the societal impacts of AI work.

Preferred Experience

  • Work on high-performance, large-scale ML systems.
  • Familiarity with GPUs, Kubernetes, and OS internals.
  • Experience with language modeling using transformer architectures.
  • Knowledge of reinforcement learning techniques.
  • Background in large-scale ETL processes.

Sample Projects

  • Optimizing the throughput of novel attention mechanisms.
  • Comparing compute efficiency of different Transformer variants.
  • Preparing large-scale datasets for efficient model consumption.
  • Scaling distributed training jobs to thousands of GPUs.
  • Designing fault tolerance strategies for training infrastructure.
  • Creating interactive visualizations of model internals (e.g., attention patterns).

Logistics

  • Locations: San Francisco, CA; Seattle, WA; New York City, NY. Role is Remote-Friendly (travel required).
  • Location-based hybrid policy: staff are expected to be in one of Anthropic's offices at least 25% of the time.
  • Education: at least a Bachelor's degree in a related field or equivalent experience is required; an MS or PhD is preferred, per the Requirements above.
  • Visa sponsorship: Anthropic sponsors visas and retains an immigration lawyer to assist; sponsorship is considered on a role-by-role and candidate-by-candidate basis.

Compensation

  • Annual Salary: $340,000 - $425,000 USD

About Anthropic

Anthropic is focused on safe, ethical, and powerful AI. The company takes a collaborative, "big science" approach to AI research, values clear communication, and encourages applications from candidates of diverse backgrounds.

Benefits

  • Competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and office workspace for collaboration.