Machine Learning Systems Engineer, Research Tools

at Anthropic

📍 New York City, United States
📍 San Francisco, United States
📍 Seattle, United States

USD 320,000-405,000 per year

MIDDLE

✅ Hybrid

Required Skills & Competences

Python @ 2
Algorithms @ 3
Distributed Systems @ 3
Machine Learning @ 3
Communication @ 3
Performance Optimization @ 3
Debugging @ 3

Details

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

Role overview

We are seeking an experienced Machine Learning Systems Engineer to join our Encodings and Tokenization team. This cross-functional role will be instrumental in developing and optimizing the encodings and tokenization systems used throughout our Finetuning workflows. As a bridge between the Pretraining and Finetuning teams, you will build infrastructure that directly affects how models learn from and interpret data, enabling more efficient and effective training while helping ensure models remain reliable, interpretable, and steerable.

Responsibilities

Design, develop, and maintain tokenization systems used across Pretraining and Finetuning workflows
Optimize encoding techniques to improve model training efficiency and performance
Collaborate closely with research teams to understand evolving needs around data representation
Build infrastructure that enables researchers to experiment with novel tokenization approaches
Implement systems for monitoring and debugging tokenization-related issues in the model training pipeline
Create robust testing frameworks to validate tokenization systems across diverse languages and data types
Identify and address bottlenecks in data processing pipelines related to tokenization
Document systems thoroughly and communicate technical decisions clearly to stakeholders across teams

Requirements

Significant software engineering experience with demonstrated machine learning expertise
Proficiency in Python and familiarity with modern ML development practices
Experience with machine learning systems, data pipelines, or ML infrastructure
Strong analytical skills and ability to evaluate the impact of engineering changes on research outcomes
Ability to work independently and collaboratively in rapidly evolving research environments
Bachelor's degree in a related field or equivalent experience (required)
Ability to work from one of Anthropic's offices roughly 25% of the time or more (location-based hybrid policy)

Strong Candidates May Also Have Experience With

Working with machine learning data processing pipelines
Building or optimizing data encodings for ML applications
Implementing or working with BPE, WordPiece, or other tokenization algorithms
Performance optimization of ML data processing systems
Multi-language tokenization challenges and solutions
Distributed systems and parallel computing for ML workflows
Large language models or other transformer-based architectures (not required)

Compensation & Logistics

Annual base salary range: $320,000 - $405,000 USD. Total compensation includes equity and benefits, and may include incentive compensation. Visa sponsorship is available in some cases. Applications are reviewed on a rolling basis.

Benefits & Culture

Anthropic is a public benefit corporation headquartered in San Francisco. The company offers competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and collaborative office spaces. It values communication, collaboration, and impact-driven research, and encourages applications from candidates who may not meet every listed qualification.