Used Tools & Technologies
Not specified
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose;
- 3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology;
- 7-9: expert. You can teach others and know all the pitfalls and tricks;
- 10: exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Linux @ 3
Python @ 6
Communication @ 3
Networking @ 3
Performance Optimization @ 3
Rust @ 2
Debugging @ 3
Observability @ 3
AI @ 3
Profiling @ 3
Details
OpenAI’s Hardware organization develops custom silicon and system-level solutions for the unique demands of advanced AI workloads. The team works across hardware, systems architecture, and software to build infrastructure that enables high-performance, AI-native computing at scale. In close partnership with research, software, and external vendors, the team brings up new platforms, integrates emerging technologies, and develops the host-side systems software needed to make these systems performant, reliable, and production-ready.
Role summary
You will define and build the host software stack for next-generation AI systems, working close to the hardware on performance-critical software including Linux kernel drivers, high-throughput I/O paths, and system-scale networking and RDMA. The role spans architecture, implementation, platform bring-up, debugging, and performance optimization across hardware and software boundaries — from low-level device interfaces through userspace tooling and production validation.
Responsibilities
- Design, implement, and debug host-side systems software for AI infrastructure, including Linux kernel drivers and supporting userspace components.
- Build and optimize software paths for high-throughput, low-latency communication, including RDMA and related networking functionality.
- Develop software around PCIe, DMA, NICs, accelerators, memory movement, and device interaction.
- Bring up new hardware platforms and diagnose complex issues across kernel, firmware, networking, and hardware boundaries.
- Build tooling for integration, testing, diagnostics, observability, qualification, and performance characterization.
- Collaborate with hardware, networking, and platform teams to define interfaces and integrate new capabilities.
- Work with external vendors where needed to integrate technologies and drive issues to resolution.
- Contribute across the systems software stack as the platform and team evolve, and help shape technical direction and engineering practices.
Requirements
- Experience building low-level or performance-critical systems software.
- Strong programming skills in C or C++, with proficiency using Python and Linux tooling for automation and debugging.
- Strong Linux systems fundamentals and the ability to debug across hardware and software boundaries.
- Hands-on experience in at least one relevant area, such as Linux kernel drivers, kernel networking, RDMA, PCIe, DMA, NIC software, accelerator software, or high-performance I/O.
- Experience owning complex software projects from design through implementation, bring-up, and validation.
- Ability to thrive in ambiguity, work across subsystem boundaries, and build systems from scratch.
- Strong cross-functional communication skills.
Preferred qualifications
- Experience developing Linux kernel drivers or other OS-level performance-critical components.
- Familiarity with RDMA or RoCE, ibverbs, kernel networking, or congestion-control concepts such as ECN and DCQCN.
- Experience with PCIe, DMA, peer-to-peer communication, SR-IOV, IOMMU, dma-buf, or related accelerator and I/O subsystems.
- Experience bringing up accelerators, NICs, SoCs, or custom hardware platforms.
- Experience profiling and optimizing high-throughput, low-latency systems.
- Familiarity with Rust or experience using Rust for systems programming.
Additional notes
- To comply with U.S. export control laws and regulations, candidates for this role may need to meet certain legal status requirements as provided in those laws and regulations.
Benefits
- Medical, dental, and vision insurance with employer contributions to Health Savings Accounts.
- Pre-tax accounts (Health FSA, Dependent Care FSA, commuter expenses).
- 401(k) retirement plan with employer match.
- Paid parental, medical, and caregiver leave.
- Paid time off and 13+ paid company holidays with coordinated office closures.
- Mental health and wellness support; employer-paid basic life and disability coverage.
- Annual learning and development stipend; daily meals in offices and meal delivery credits as eligible.
- Relocation support for eligible employees; additional taxable fringe benefits (charitable donation matching, wellness stipends) may be provided.