Software Engineer, Inference AMD GPU Enablement

at OpenAI

📍 San Francisco, United States

USD 325,000-490,000 per year

MIDDLE

✅ On-site

✅ Relocation

Used Tools & Technologies

Not specified

Required Skills & Competences
Tag name is followed by "@" symbol and proficiency level value. About proficiency levels:

1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;

3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;

7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;

10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such level.

Hiring @ 3 Communication @ 3 CUDA @ 6 GPU @ 3

Details

Our Inference team brings OpenAI’s most capable research and technology to the world through our products. We empower consumers, enterprises and developers alike to use and access our state-of-the-art AI models, allowing them to do things that they’ve never been able to before. We focus on performant and efficient model inference, as well as accelerating research progression via model inference.

About the role

We’re hiring engineers to scale and optimize OpenAI’s inference infrastructure across emerging GPU platforms. You’ll work across the stack — from low-level kernel performance to high-level distributed execution — and collaborate closely with research, infra, and performance teams to ensure our largest models run smoothly on new hardware. This is a high-impact opportunity to shape OpenAI’s multi-platform inference capabilities from the ground up with a particular focus on advancing inference performance on AMD accelerators.

Responsibilities

Own bring-up, correctness and performance of the OpenAI inference stack on AMD hardware.
Integrate internal model-serving infrastructure (e.g., vLLM, Triton) into a variety of GPU-backed systems.
Debug and optimize distributed inference workloads across memory, network, and compute layers.
Validate correctness, performance, and scalability of model execution on large GPU clusters.
Collaborate with partner teams to design and optimize high-performance GPU kernels for accelerators using HIP, Triton, or other performance-focused frameworks.
Collaborate with partner teams to build, integrate and tune collective communication libraries (e.g., RCCL) used to parallelize model execution across many GPUs.

Requirements

Experience writing or porting GPU kernels using HIP, CUDA, or Triton, and strong focus on low-level performance.
Familiarity with communication libraries like NCCL/RCCL and understanding of their role in high-throughput model serving.
Experience with distributed inference systems and comfortable scaling models across fleets of accelerators.
Comfortable solving end-to-end performance challenges across hardware, system libraries, and orchestration layers.
Ability to collaborate on building new infrastructure from first principles in a small, fast-moving team.

Nice to have

Contributions to open-source libraries like RCCL, Triton, or vLLM.
Experience with GPU performance tools (Nsight, rocprof, perf) and memory/comms profiling.
Prior experience deploying inference on non-NVIDIA GPU environments.
Knowledge of model/tensor parallelism, mixed precision, and serving 10B+ parameter models.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core.

We are an equal opportunity employer and do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic. Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws.

Benefits & Compensation

Listed base salary range: $325,000 - $490,000 (additional equity and performance-related bonuses may apply).
Medical, dental, and vision insurance with employer HSA contributions.
Pre-tax health and dependent care FSAs and commuter accounts.
401(k) with employer match.
Paid parental and caregiver leave; flexible PTO and paid holidays.
Mental health and wellness support; employer-paid basic life and disability coverage.
Annual learning and development stipend; daily office meals and meal delivery credits as eligible.
Relocation support for eligible employees.

For more details on benefits and policies, candidates are referred to OpenAI's public policy and benefits materials.