Software Engineer, Data Infrastructure

at OpenAI
USD 210,000-405,000 per year
MIDDLE
✅ Hybrid
✅ Relocation

Used Tools & Technologies

Not specified

Required Skills & Competences

Security @ 3, Kafka @ 3, Terraform @ 2, Spark @ 3, Airflow @ 3, Distributed Systems @ 3, Flink @ 3, Machine Learning @ 3, Debugging @ 3, Trino @ 3

Details

Data Platform at OpenAI owns the foundational data stack powering critical product, research, and analytics workflows. We operate very large Spark compute fleets in production; design and build data lakes and metadata systems on Iceberg and Delta with a vision toward exabyte-scale architecture; run high-throughput streaming platforms on Kafka and Flink; provide orchestration with Airflow; and support ML feature engineering tooling such as Chronon. The team’s mission is to deliver reliable, secure, and efficient data access at scale and accelerate intelligent, AI-assisted data workflows.

About the Role

This role focuses on building and operating data infrastructure that supports massive compute fleets and storage systems, designed for high performance and scalability. You will help design, build, and operate the next generation of data infrastructure at OpenAI: scale and harden big-data compute and storage platforms, build and support high-throughput streaming systems, build and operate low-latency data ingestion, enable secure and governed data access for ML and analytics, and design for reliability and performance at extreme scale. You will take full lifecycle ownership: architecture, implementation, production operations, and on-call participation.

This role is based exclusively at the San Francisco, CA headquarters and uses a hybrid work model (3 days in office per week). Relocation assistance is offered to new employees.

Responsibilities

  • Design, build, and maintain data infrastructure systems such as distributed compute, data orchestration, distributed storage, streaming infrastructure, and machine learning infrastructure while ensuring scalability, reliability, and security.
  • Ensure the data platform can scale by orders of magnitude while remaining reliable and efficient.
  • Accelerate company productivity by empowering engineers and teammates with excellent data tooling and systems.
  • Collaborate with product, research, and analytics teams to build foundational technical capabilities that unlock new features and experiences.
  • Own the reliability of the systems you build, including participation in an on-call rotation for critical incidents.

Requirements

  • 4+ years in data infrastructure engineering OR 4+ years in infrastructure engineering with a strong interest in data.
  • Experience supporting platforms such as Spark, Kafka, Flink, Airflow, Trino, Iceberg, or Delta.
  • Familiarity with infrastructure tooling such as Terraform.
  • Experience debugging large-scale distributed systems and designing for reliability, performance, and security at scale.
  • Comfortable with ambiguity and rapid change; willingness to learn missing skills and share learnings with others.
  • Ability to take full lifecycle ownership including architecture, implementation, production operations, and on-call responsibilities.

Technologies and Systems Mentioned

  • Spark, Kafka, Flink, Airflow, Trino, Iceberg, Delta, Chronon
  • Terraform
  • Distributed compute and storage systems, data lakes, metadata systems, streaming platforms, ML feature engineering tooling
  • High-throughput streaming, low-latency ingestion, security and governance for data access

Benefits

  • Base salary within the range listed for this role, plus equity and other compensation components.
  • Medical, dental, and vision insurance (employer contributions to HSAs where applicable).
  • Pre-tax accounts (Health FSA, Dependent Care FSA, commuter benefits).
  • 401(k) with employer match.
  • Paid parental leave and paid medical/caregiver leave.
  • Flexible PTO for exempt employees and up to 15 days annually for non-exempt employees.
  • 13+ paid company holidays and additional office closures.
  • Mental health and wellness support; employer-paid basic life and disability coverage.
  • Annual learning and development stipend.
  • Daily meals in offices and meal delivery credits as eligible.
  • Relocation support for eligible employees.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring general-purpose artificial intelligence benefits all of humanity. The company emphasizes safety, diverse perspectives, and equal employment opportunity. Background checks and candidate accommodations processes are described in the posting.