Software Engineer, Model Inference

at OpenAI

📍 San Francisco, United States

USD 325,000-490,000 per year

MIDDLE

✅ On-site

✅ Relocation

Used Tools & Technologies

Not specified

Required Skills & Competences
Tag name is followed by "@" symbol and proficiency level value. About proficiency levels:

1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;

3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;

7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;

10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such level.

Distributed Systems @ 1 Machine Learning @ 3 Azure @ 3 Debugging @ 1 PyTorch @ 2 CUDA @ 2 GPU @ 3

Details

Our Inference team brings OpenAI’s most capable research and technology to the world through our products. We empower consumers, enterprise and developers alike to use and access our state-of-the-art AI models. We focus on performant and efficient model inference and on accelerating research progression via model inference.

Responsibilities

Work alongside machine learning researchers, engineers, and product managers to bring latest technologies into production.
Enable advanced research through engineering collaboration with researchers.
Introduce new techniques, tools, and architectures that improve performance, latency, throughput, and efficiency of the model inference stack.
Build tools to give visibility into bottlenecks and sources of instability and design/implement solutions for the highest priority issues.
Optimize code and a fleet of Azure VMs to utilize every FLOP and every GB of GPU RAM.

Requirements

At least 5 years of professional software engineering experience.
Understanding of modern ML architectures and intuition for optimizing their performance, particularly for inference.
Familiarity or ability to quickly gain familiarity with PyTorch, NVIDIA GPUs and the software stacks that optimize them (for example NCCL, CUDA).
Knowledge of HPC technologies such as InfiniBand, MPI, NVLink, etc.
Experience architecting, building, observing, and debugging production distributed systems (bonus for performance-critical distributed systems).
Experience rebuilding or substantially refactoring production systems due to rapidly increasing scale.
Ability to own problems end-to-end and pick up missing knowledge as needed.
Self-directed, humble, collaborative, and committed to team success.

Benefits

Base pay range listed below; total compensation may include equity and performance-related bonuses.
Medical, dental, and vision insurance with employer contributions to Health Savings Accounts.
Pre-tax accounts (Health FSA, Dependent Care FSA, commuter benefits).
401(k) with employer match.
Paid parental, medical, and caregiver leave; flexible PTO/paid time off and company holidays.
Mental health and wellness support; employer-paid basic life and disability coverage.
Annual learning and development stipend; daily meals in offices and meal delivery credits as eligible.
Relocation support for eligible employees.

Additional notes

Background checks will be administered in accordance with applicable law.
OpenAI is an equal opportunity employer and provides reasonable accommodations for applicants with disabilities.