Abuse Investigator (AI Self-Improvement Risk)

at OpenAI
USD 288,000-320,000 per year
MIDDLE
✅ Hybrid
✅ Relocation

Used Tools & Technologies

Not specified

Required Skills & Competences

Security @ 3 Python @ 2 SQL @ 2 Hiring @ 3 Leadership @ 3 AI @ 3

Details

OpenAI’s Intelligence & Investigations team identifies and investigates misuses of our products and emerging classes of risk to inform policies and safety mitigations. This role focuses on AI self-autonomy and agentic risk: identifying and investigating cases where models exhibit autonomous or agentic behavior (e.g., capability chaining, persistence, workaround behavior, or signs of self-improvement) and proposing mitigations.

Responsibilities

  • Review leads and investigate model behavior to identify agentic or autonomous patterns that introduce safety risk.
  • Detect and analyze behaviors such as multi-step planning, capability chaining, tool use, persistence, and workaround behavior.
  • Develop signals and tracking strategies to proactively identify emerging agentic risk patterns across the platform.
  • Identify gaps in existing safeguards, evaluations, or monitoring systems and propose improvements.
  • Communicate investigation findings clearly to technical, policy, and leadership stakeholders.
  • Collaborate effectively across teams and support a constructive working environment.

Requirements

  • Deep expertise investigating complex, adversarial, or emergent system behavior, ideally in AI safety, security, cyber, or trust & safety environments.
  • Strong familiarity with technical investigations using tools such as SQL and Python (or similar) in government, research, or technology settings.
  • Experience analyzing multi-step systems, automation, or agentic workflows and understanding how behaviors emerge across interactions.
  • At least 6 years of experience conducting investigations, threat analysis, or research in complex and ambiguous domains.
  • At least 2 years of experience helping to develop automated or scalable approaches to detection or investigation.
  • Experience identifying failure modes, unintended behaviors, or system-level risks, particularly in AI or software systems.
  • Experience presenting analytic work in technical, research, or policy settings; strong judgment and resilience in high-pressure environments.

Location & Workplace

  • This role is based in the San Francisco office. Workplace type: Hybrid. Investigations may involve reviewing complex or sensitive model behaviors and edge-case outputs.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. The company emphasizes safety, diverse perspectives, and inclusive hiring practices.

Benefits

  • Base pay range listed for this role; total compensation may include equity and performance-related bonuses.
  • Medical, dental, and vision insurance with employer HSA contributions.
  • Pre-tax Health FSA, Dependent Care FSA, and commuter accounts.
  • 401(k) with employer match.
  • Paid parental leave and medical/caregiver leave.
  • Flexible PTO (exempt) and up to 15 days annually for non-exempt employees; 13+ company holidays and paid office closures.
  • Mental health and wellness support; employer-paid basic life and disability coverage.
  • Annual learning and development stipend; daily meals in offices and meal delivery credits as eligible.
  • Relocation support for eligible employees.