Anthropic AI Safety Fellow, Canada

at Anthropic

📍 Canada

CAD 122,200 per year

CAD 59 per hour

MIDDLE

✅ Remote

Used Tools & Technologies

Not specified

Required Skills & Competences

Python @ 6, Machine Learning @ 2, Communication @ 3, Mathematics @ 6, API @ 3

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. This Fellowship is an external collaboration program aimed at accelerating empirical AI safety research by providing promising talent with mentorship, funding, compute, and workspace. The program will run for about 2 months with the possibility of extension for another 4 months, and Fellows are expected to produce a public output (for example, a paper submission) aligned with Anthropic’s research priorities.

Responsibilities

Use external infrastructure (e.g., open-source models, public APIs) to work on an empirical project aligned with Anthropic research priorities.
Produce a public output (e.g., paper submission) from the project work.
Receive and incorporate mentorship and feedback from Anthropic researchers.
Work full-time on the fellowship for the committed period (at least 2 months, ideally 6 months if extended).

What To Expect / Benefits

Direct mentorship from Anthropic researchers and connection to the broader AI safety research community.
Weekly stipend of 2,350 CAD and access to benefits (benefits vary by country but may include medical, dental, and vision insurance).
Funding for compute and other research expenses.
Access to a shared workspace and substantial support to develop research skills.
Fellows will be employed by a third-party talent partner and may be eligible for benefits through the employer of record.

Mentors & Research Areas

Fellows will be matched with mentors from Anthropic, such as Ethan Perez, Jan Leike, Emmanuel Ameisen, and Jascha Sohl-Dickstein, among others.
Example research areas include:
- Scalable Oversight
- Adversarial Robustness and AI Control
- Model Organisms
- Model Internals / Mechanistic Interpretability
- AI Welfare

Requirements

Motivation to reduce catastrophic risks from advanced AI systems.
Strong technical background in computer science, mathematics, physics, or related fields (or equivalent experience).
Strong programming skills, particularly in Python.
Familiarity with machine learning frameworks and empirical ML research workflows is valued.
Ability to work full-time on the fellowship (expectation: 40 hours per week).
Ability to obtain work authorization for the US, UK, or Canada, and ability to work full-time out of Berkeley or London (or remotely if based in Canada). (Note: Anthropic cannot sponsor visas for every role; it can support Fellows on F-1 visas who are eligible for full-time OPT/CPT.)
Ability to thrive in fast-paced, collaborative environments and to execute projects independently while incorporating feedback.
At least a Bachelor's degree in a related field or equivalent experience is required.

Strong Candidates May Also Have

Experience with empirical ML research projects.
Experience working with Large Language Models.
Experience in research areas such as interpretability.
Experience with deep learning frameworks and experiment management.
Track record of open-source contributions.

Candidates Need Not Have

100% of the listed skills or formal certifications/education credentials; Anthropic encourages applicants from diverse backgrounds.

Interview Process

Applications and interviews are handled on a rolling basis; applications are requested by August 17 for the October cohort.
Stages include: initial application and references, a 90-minute Python coding screen, a coding-based technical interview (55 minutes, no ML), final interviews (Research Discussion and a take-home project with review), and reference checks.
Anthropic aims to extend offers by early October; decisions on extending fellowships will be made in mid-December.

Compensation (CAD)

Weekly stipend: 2,350 CAD/week with an expectation of 40 hours/week.
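
The per-hour and per-year figures shown at the top of this listing appear to be derived from the weekly stipend. A minimal sketch of that arithmetic follows, assuming a 52-week annualization (an assumption, not stated in the posting; the fellowship itself runs only about 2 to 6 months, so the annual figure is an equivalent rate rather than an actual payout):

```python
# Figures taken from the posting.
WEEKLY_STIPEND_CAD = 2350
HOURS_PER_WEEK = 40

# Assumption: annualize over 52 weeks. The fellowship runs ~2 to ~6 months,
# so this is an equivalent rate, not the amount a Fellow would receive.
WEEKS_PER_YEAR = 52

hourly_rate = WEEKLY_STIPEND_CAD / HOURS_PER_WEEK
annualized = WEEKLY_STIPEND_CAD * WEEKS_PER_YEAR

print(f"Hourly equivalent: CAD {hourly_rate:.2f}")
print(f"Annualized equivalent: CAD {annualized:,}")
```

Under these assumptions the hourly equivalent is CAD 58.75, which rounds to the listed CAD 59, and the annualized equivalent matches the listed CAD 122,200.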

Role-Specific Location Policy

This role is exempt from the general 25% in-office expectation and can be done remotely from anywhere in Ontario or British Columbia, Canada.

Logistics / Additional Notes

The program duration is ~2 months with possible extension to a total of ~6 months based on progress.
Anthropic emphasizes collaborative, large-scale research efforts and values communication skills in addition to technical strength.
Anthropic provides guidance on candidates' use of AI in the application process.