Software Engineer, Productivity - Inference Runtime

at OpenAI

📍 San Francisco, United States

USD 230,000-385,000 per year

MIDDLE

✅ On-site

✅ Relocation

Used Tools & Technologies

Not specified

Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level. The proficiency levels are:

1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose.

3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology.

7-9: expert. You can teach others and you know all the pitfalls and tricks.

10: exceptional knowledge. Comprehensive understanding of, and adeptness in, all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.

Python @ 3, CI/CD @ 3, Distributed Systems @ 3, Hiring @ 3, Debugging @ 3, API @ 3, ChatGPT @ 3, GPU @ 3, Codex @ 3, Observability @ 6

Details

We’re hiring a Developer Productivity engineer to support OpenAI’s Inference Runtime teams. These teams own systems responsible for serving models reliably, efficiently, and safely across Codex, ChatGPT, API, and internal research workloads. This role sits at the intersection of developer experience, CI/CD infrastructure, release engineering, production readiness, and inference systems reliability. You will work on tooling and operational foundations that support model launches, inference optimizations, cloud provider integrations, and large-scale deployments across a rapidly evolving inference stack.

Responsibilities

Improve systems that ensure inference engine releases are correct, performant, and regression-free by evolving tooling and infrastructure for deploy gate validation.
Bring rigor to release, validation, branching, and deployment processes across the inference stack.
Improve canary, async, and large-scale validation workflows for inference systems.
Harden CI, testing, and validation infrastructure so failures are actionable and trustworthy.
Reduce noisy or flaky failures caused by infrastructure instability, GPU scheduling, or test environment issues.
Build automation for failure triage, ownership detection, debugging, and escalation.
Partner closely with inference teams, research developer productivity, engine acceleration, and infrastructure teams to improve release quality and rollout safety.
Reduce developer friction in testing, debugging, and release workflows to enable engineers to move faster with confidence.

Requirements

Strong experience with CI/CD systems, testing infrastructure, release tooling, developer productivity, or large-scale build and validation systems.
Comfortable working in Python-heavy environments and debugging complex distributed systems; C++ experience is helpful but not required.
Experience or strong interest in improving observability, rollout safety, release automation, and developer self-service tooling.
Ability to harden systems that catch issues before they reach production, reduce noise from flaky or infrastructure-related test failures, and automate triage and escalation workflows.
High ownership, strong developer empathy, and comfort operating in ambiguous, cross-functional areas without a fully predefined roadmap.
Eagerness to learn about large-scale inference systems; prior inference experience is not required.

Benefits

Base pay range listed separately; total compensation may include equity and performance-related bonuses.
Medical, dental, and vision insurance with employer contributions to Health Savings Accounts.
Pre-tax accounts (Health FSA, Dependent Care FSA, commuter expenses).
401(k) retirement plan with employer match.
Paid parental leave and paid medical/caregiver leave; flexible PTO and paid company holidays.
Mental health and wellness support; employer-paid basic life and disability coverage.
Annual learning and development stipend; daily meals in offices and meal delivery credits as eligible.
Relocation support for eligible employees.
Additional fringe benefits (charitable donation matching, wellness stipends) as applicable.