Used Tools & Technologies
Not specified
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such level.
Marketing @ 4
Security @ 4
Python @ 6
Scala @ 6
Spark @ 4
ETL @ 4
Java @ 6
Airflow @ 4
Flink @ 4
Data Science @ 4
Dagster @ 4
Data Engineering @ 4
Hadoop @ 4
Experimentation @ 4
ChatGPT @ 4
Compliance @ 4
AI @ 4
Data Pipelines @ 4
Details
About the Team
The Statsig team at OpenAI builds and operates the experimentation platform that powers product development, measurement, and decision-making across the company. We partner closely with product, engineering, and infrastructure teams to ensure experiments are trustworthy, statistically rigorous, and scalable to the needs of frontier AI products.
Our mission is to help teams make better decisions through reliable experimentation. We care deeply about statistical correctness, pragmatic solutions, and building systems that researchers and engineers can trust at massive scale. The team operates at the intersection of experimentation methodology, data infrastructure, causal inference, and product analytics.
We are looking for experienced experimentation experts who want to shape the future of experimentation in the AI era.
About the role
We're seeking a Data Engineer to take the lead in building our data pipelines and core tables for OpenAI. These pipelines are crucial for powering the analyses and safety systems that guide business decisions, drive product growth, and stop bad actors. This role provides the opportunity to collaborate closely with the researchers behind ChatGPT and help them train new models to deliver to users. As we continue our rapid growth, we value data-driven insights, and your contributions will play a pivotal role in our trajectory.
This role is based in Bellevue and uses a hybrid work model that values in-person collaboration for technical design, iteration, and cross-functional partnership.
Responsibilities
- Design, build and manage data pipelines, ensuring all user event data is seamlessly integrated into the data warehouse.
- Develop canonical datasets to track key product metrics including user growth, engagement, and revenue.
- Work collaboratively with Infrastructure, Data Science, Product, Marketing, Finance, and Research to understand data needs and provide solutions.
- Implement robust and fault-tolerant systems for data ingestion and processing.
- Participate in data architecture and engineering decisions.
- Ensure the security, integrity, and compliance of data according to industry and company standards.
Requirements
- 3+ years of experience as a data engineer and 8+ years of software engineering experience overall (including data engineering).
- Proficiency in at least one programming language commonly used within Data Engineering, such as Python, Scala, or Java.
- Experience with distributed processing frameworks such as Hadoop and Flink, and with distributed storage systems (e.g., HDFS, S3).
- Expertise with ETL schedulers such as Airflow, Dagster, Prefect or similar frameworks.
- Solid understanding of Spark and ability to write, debug and optimize Spark code.
Compensation Range
$293K - $325K USD
Benefits
- Medical, dental, and vision insurance with employer contributions to Health Savings Accounts
- Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses
- 401(k) retirement plan with employer match
- Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
- Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
- 13+ paid company holidays and multiple paid coordinated company office closures
- Mental health and wellness support
- Employer-paid basic life and disability coverage
- Annual learning and development stipend
- Daily meals in offices and meal delivery credits as eligible
- Relocation support for eligible employees
- Additional taxable fringe benefits such as charitable donation matching and wellness stipends
About OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. OpenAI is an equal opportunity employer and is committed to providing reasonable accommodations to applicants with disabilities.