Abuse Investigator (AI Self-Improvement Risk)

at OpenAI

📍 San Francisco, United States

USD 288,000-320,000 per year

MIDDLE

✅ Hybrid

✅ Relocation

Used Tools & Technologies

Not specified

Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level. About proficiency levels:

1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose.

3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology.

7-9: expert. You can teach others and you know the pitfalls and tricks.

10: exceptional knowledge. Comprehensive understanding of and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.

Security @ 3, Python @ 2, SQL @ 2, Hiring @ 3, Leadership @ 3, AI @ 3

Details

OpenAI’s Intelligence & Investigations team identifies and investigates misuse of our products and emerging classes of risk to inform policies and safety mitigations. This role focuses on AI autonomy and agentic risk: identifying and investigating cases where models exhibit autonomous or agentic behavior (e.g., capability chaining, persistence, workaround behavior, or signs of self-improvement) and proposing mitigations.

Responsibilities

Review leads and investigate model behavior to identify agentic or autonomous patterns that introduce safety risk.
Detect and analyze behaviors such as multi-step planning, capability chaining, tool use, persistence, and workaround behavior.
Develop signals and tracking strategies to proactively identify emerging agentic risk patterns across the platform.
Identify gaps in existing safeguards, evaluations, or monitoring systems and propose improvements.
Communicate investigation findings clearly to technical, policy, and leadership stakeholders.
Collaborate effectively across teams and support a constructive working environment.

Requirements

Deep expertise investigating complex, adversarial, or emergent system behavior, ideally in AI safety, security, cyber, or trust & safety environments.
Strong familiarity with technical investigations using tools such as SQL and Python (or similar) in government, research, or technology settings.
Experience analyzing multi-step systems, automation, or agentic workflows and understanding how behaviors emerge across interactions.
At least 6 years of experience conducting investigations, threat analysis, or research in complex and ambiguous domains.
At least 2 years of experience helping to develop automated or scalable approaches to detection or investigation.
Experience identifying failure modes, unintended behaviors, or system-level risks, particularly in AI or software systems.
Experience presenting analytic work in technical, research, or policy settings.
Strong judgment and resilience in high-pressure environments.

Location & Workplace

This role is based in the San Francisco office. Workplace type: Hybrid. Investigations may involve reviewing complex or sensitive model behaviors and edge-case outputs.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. The company emphasizes safety, diverse perspectives, and inclusive hiring practices.

Benefits

The range listed for this role is base pay; total compensation may also include equity and performance-related bonuses.
Medical, dental, and vision insurance with employer HSA contributions.
Pre-tax Health FSA, Dependent Care FSA, and commuter accounts.
401(k) with employer match.
Paid parental leave and medical/caregiver leave.
Flexible PTO for exempt employees and up to 15 days of PTO annually for non-exempt employees; 13+ company holidays and paid office closures.
Mental health and wellness support; employer-paid basic life and disability coverage.
Annual learning and development stipend; daily meals in offices and meal delivery credits as eligible.
Relocation support for eligible employees.