Research Engineer, Tokens (Pre-Training)

USD 315,000-340,000 per year
MIDDLE
✅ Hybrid

Required Skills & Competences

ETL @ 3

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. The team is composed of researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

Role overview

You will be responsible for pretraining data research. Work may include understanding pretraining data trends and scaling laws, optimizing pretraining data mixes, investigating potential new sources of data, building research tools to better understand experimental results, and figuring out how to process and use pretraining data most effectively.

Representative projects / Responsibilities

  • Compare the compute efficiency of different datasets.
  • Create multimodal datasets in formats models can easily consume.
  • Scale data processing jobs to thousands of machines.
  • Design research tools to analyze and manage data ablation experiments.
  • Create interactive visualizations of semantic clusters in training data.
  • Conduct empirical research into pretraining data mixes, scaling laws, and data sources.
  • Build tooling and infrastructure to process, inspect, and iterate on large pretraining datasets.

Requirements

  • Significant software engineering experience.
  • Comfort working in a very empirical research environment and focusing on impact.
  • Willingness to work flexibly and beyond strict job boundaries when needed.
  • Interest in and care about societal impacts of AI systems.

Strong candidates may also have experience with:

  • High performance, large-scale ML systems.
  • Language modeling with transformers.
  • Large-scale ETL (extract, transform, load) for training data.
  • Designing ML experiments and researching ML fundamentals.
  • Inspecting and iterating on data (for example, in ML competitions or quantitative finance).

Education: At least a Bachelor's degree in a related field or equivalent experience.

Logistics / Additional details

  • Locations listed: Remote-Friendly (travel required); San Francisco, CA; New York City, NY.
  • Location-based hybrid policy: staff are expected to be in one of the offices at least 25% of the time (some roles may require more time in offices).
  • Visa sponsorship: Anthropic sponsors visas and retains an immigration lawyer. Sponsorship may not be possible for every role or candidate, but the company will make reasonable efforts if an offer is extended.
  • The company encourages candidates to apply even if they do not meet every qualification listed and emphasizes diversity and inclusion.

Benefits

  • Competitive compensation and benefits.
  • Optional equity donation matching.
  • Generous vacation and parental leave.
  • Flexible working hours and collaborative office spaces.
  • Guidance on acceptable AI usage in the application process.