Used Tools & Technologies
Machine Learning, GPU
Required Skills & Competences
Each tag name is followed by the "@" symbol and a proficiency level.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Distributed Systems @ 4
Networking @ 4
AI @ 4
Robotics @ 4
Details
About the Team
The OpenAI Robotics team is focused on unlocking general-purpose robotics and pushing towards AGI-level intelligence in dynamic, real-world settings. Working across the entire model stack, the team integrates cutting-edge hardware and software to explore a broad range of robotic form factors and blend high-level AI capabilities with the constraints of physical systems.
About the Role
As a Senior Software Engineer, ML Systems & Training Infrastructure, you will be a deeply hands-on engineering force multiplier for the robotics team. You will help keep the training framework and surrounding infrastructure healthy, review and improve code quickly, debug failures across ML systems and infrastructure, and unblock researchers and engineers when the path from idea to working training job gets rough. The role is based in San Francisco, CA and is expected in-office 5 days per week. Relocation assistance is offered to new employees.
Responsibilities
- Review, improve, and clean up code across training frameworks and adjacent infrastructure.
- Identify risky or low-quality changes before they land, and raise the code quality bar without slowing the team down.
- Debug issues across ML training systems, GPUs, clusters, networking, and related infrastructure.
- Help researchers and engineers unblock broken training jobs, flaky workflows, and brittle internal tooling.
- Improve the reliability, maintainability, and usability of the robotics team’s training framework.
- Move quickly on practical engineering problems that directly affect team velocity.
Requirements
- Strong software engineering fundamentals and excellent code review judgment.
- Experience with ML systems, training frameworks, GPUs, distributed systems, infrastructure, or similarly complex technical environments.
- Ability to read and debug unfamiliar codebases quickly and get to root cause.
- Ability to ship high-quality code with strong velocity and pragmatic judgment.
- Comfortable working as a hands-on individual contributor focused on enabling researchers and engineers.
- Experience reviewing messy, fast-moving, or AI-generated codebases.
Compensation
- Compensation Range: $295K - $380K USD
- Offers equity. Base pay may vary depending on market location, skills, and experience. Total compensation may include equity, performance-related bonuses, and other components.
Benefits
- Medical, dental, and vision insurance with employer contributions to Health Savings Accounts.
- Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses.
- 401(k) retirement plan with employer match.
- Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents); paid medical and caregiver leave (up to 8 weeks).
- Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees.
- 13+ paid company holidays and multiple coordinated company office closures.
- Mental health and wellness support; employer-paid basic life and disability coverage.
- Annual learning and development stipend.
- Daily meals in offices and meal delivery credits as eligible.
- Relocation support for eligible employees.
Other Notes
- Background checks will be administered in accordance with applicable law.
- OpenAI is an equal opportunity employer and committed to reasonable accommodations for applicants with disabilities.