Anthropic AI Safety Fellow, Canada

at Anthropic

📍 Canada

CAD 122,200 per year

CAD 58 per hour

✅ Remote

Used Tools & Technologies

Not specified

Required Skills & Competences

Python @ 5
Machine Learning @ 5
Mathematics @ 6
API @ 3

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial for users and for society. The team includes researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

Responsibilities

Participate in the Anthropic Fellows Program, a 2-month external collaboration (possibly extendable to 6 months) that aims to accelerate AI safety research.
Use external infrastructure such as open-source models and public APIs for empirical projects aligned with Anthropic's research priorities.
Produce public outputs like paper submissions.
Receive mentorship from Anthropic researchers, funding, compute resources, and access to shared workspace.

Mentors & Research Areas

Mentors include notable researchers such as Ethan Perez, Jan Leike, and Emmanuel Ameisen. Research areas include:

Scalable Oversight to keep models helpful and honest.
Adversarial Robustness and AI Control.
Model Organisms for understanding alignment failures.
Model Internals / Mechanistic Interpretability.
AI Welfare and related evaluations and mitigations.

Requirements

Motivated to reduce catastrophic risks from advanced AI.
Interested in transitioning to full-time empirical AI safety research.
Strong technical background in computer science, mathematics, physics, or related fields.
Proficient in Python and machine learning frameworks.
Able to commit to the fellowship full-time for at least 2 months, ideally 6 months.
Eligible to work in Canada and able to work remotely from Ontario or British Columbia.
Comfortable programming in Python.
Able to work both independently and collaboratively.

Strong Candidates May Also Have

Experience with empirical ML research.
Experience with large language models.
Experience in AI safety research areas such as interpretability.
Experience with deep learning frameworks and experiment management.
Open-source contributions.

Benefits

Weekly stipend of CAD 2,350.
Access to benefits, including medical, dental, and vision insurance (benefits vary by country).
Funding for compute and research expenses.
Fellows are employed through a third-party talent partner.
Remote work allowed within Ontario or British Columbia.

Interview Process

Initial application and references.
90-minute coding assessment in Python.
Technical interview focused on coding, with no ML content.
Final interviews including research discussion and take-home project.
Offer decisions by early October, with offers extended on a rolling basis.

Location and Office Policy

Remote work allowed from Ontario or British Columbia.
Fellows are exempt from the on-site policy that requires 25% office presence.

Additional Information

The role requires at least a bachelor's degree or equivalent experience.
Anthropic encourages applications from underrepresented groups and does not expect candidates to meet every listed qualification.
Visa sponsorship is not provided for Fellows.
Anthropic values big science, collaboration, and impact in AI research.