Researcher, Alignment Oversight

at OpenAI

📍 San Francisco, United States

USD 250,000-445,000 per year

MIDDLE

✅ Hybrid

✅ Relocation

Used Tools & Technologies

Machine Learning, LLM

Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level. The levels mean:

1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;

3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;

7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;

10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.

Security @ 3, Debugging @ 6, Codex @ 3, AI @ 3, Reinforcement Learning @ 3, Agentic AI @ 3

Details

About the Team

The Alignment Oversight team at OpenAI develops techniques for improving control, accountability, and alignment as AI systems become more capable and agentic. We combine longer-horizon research with hands-on deployment: we study long-term questions about how increasingly intelligent systems can be supervised, constrained, and corrected, while also building oversight systems that are used in practice today (for example, code review and action monitoring for Codex).

We also study how to learn from real-world deployments: using oversight data and human interventions to train future models to be more aligned, while preserving the effectiveness and independence of the oversight systems themselves.

About the Role

As a researcher on the Alignment Oversight team, you will design and run experiments that improve our ability to oversee increasingly capable models. You will work on hands-on model training, evaluation design, and research infrastructure, and you will translate promising oversight ideas into systems that can operate on real model traffic and real user workflows.

This role combines longer-horizon research with shorter deployment sprints, with projects typically scoped around 3–6 month research timelines and aimed at directly improving future model behavior.

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

Responsibilities

Design and implement alignment experiments focused on oversight systems for increasingly agentic AI models.
Deploy practical systems for action monitoring, red-teaming, and human-in-the-loop control.
Develop evaluations for alignment failure modes of frontier models, such as overeagerness, instruction-following failures, covert actions, evasion of restrictions, and scheming propensity.
Analyze deployment data to understand model failures, oversight gaps, and opportunities for training more aligned models.
Develop techniques for feeding oversight signals back into training while preserving the reliability and independence of the oversight process.
Produce externally publishable research when results advance the broader science of alignment.
Collaborate across research, product, security, safety, and engineering teams to turn alignment ideas into working systems.
Move quickly from research intuition to working experiments, prototypes, and evidence that can shape future models.

Requirements

Strong hands-on experience training, evaluating, or debugging large ML models, especially LLMs.
Experience with reinforcement learning, post-training, preference optimization, scalable oversight, model evaluation, or adjacent empirical ML research.
Strong engineering execution and ability to turn ambiguous research ideas into reliable experiments, tools, training pipelines, and production-facing systems.
Research intuitions for designing informative experiments while staying grounded in implementation details and empirical results.
Ability to work across functions in fast-paced, collaborative research environments.
Commitment to coupling safety and usefulness in AI systems.

Benefits

Base pay varies depending on market location, knowledge, skills, and experience. In addition to base pay, total compensation includes equity and potential performance-related bonuses.
Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts.
Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit).
401(k) retirement plan with employer match.
Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks).
Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees.
13+ paid company holidays and multiple paid coordinated company office closures throughout the year, plus paid sick or safe time as required by law.
Mental health and wellness support; employer-paid basic life and disability coverage.
Annual learning and development stipend.
Daily meals in offices and meal delivery credits as eligible.
Relocation support for eligible employees.
Additional taxable fringe benefits such as charitable donation matching and wellness stipends.

Additional Information

Background checks will be administered in accordance with applicable law. OpenAI is an equal opportunity employer and provides reasonable accommodations to applicants with disabilities.