Software Engineer, ML Systems & Training Architecture

at OpenAI
USD 295,000-380,000 per year
SENIOR
✅ On-site
✅ Relocation

Used Tools & Technologies

Machine Learning GPU

Required Skills & Competences

Distributed Systems @ 4 Networking @ 4 AI @ 4 Robotics @ 4

Details

About the Team

The OpenAI Robotics team is focused on unlocking general-purpose robotics and pushing towards AGI-level intelligence in dynamic, real-world settings. Working across the entire model stack, the team integrates cutting-edge hardware and software to explore a broad range of robotic form factors and blend high-level AI capabilities with the constraints of physical systems.

About the Role

As a Senior Software Engineer, ML Systems & Training Infrastructure, you will be a deeply hands-on engineering force multiplier for the robotics team. You will help keep the training framework and surrounding infrastructure healthy, review and improve code quickly, debug failures across ML systems and infrastructure, and unblock researchers and engineers when the path from idea to working training job gets rough. The role is based in San Francisco, CA and is expected in-office 5 days per week. Relocation assistance is offered to new employees.

Responsibilities

  • Review, improve, and clean up code across training frameworks and adjacent infrastructure.
  • Identify risky or low-quality changes before they land, and raise the code quality bar without slowing the team down.
  • Debug issues across ML training systems, GPUs, clusters, networking, and related infrastructure.
  • Help researchers and engineers unblock broken training jobs, flaky workflows, and brittle internal tooling.
  • Improve the reliability, maintainability, and usability of the robotics team’s training framework.
  • Move quickly on practical engineering problems that directly affect team velocity.

Requirements

  • Strong software engineering fundamentals and excellent code review judgment.
  • Experience with ML systems, training frameworks, GPUs, distributed systems, infrastructure, or similarly complex technical environments.
  • Ability to read and debug unfamiliar codebases quickly and get to root cause.
  • Ability to ship high-quality code with strong velocity and pragmatic judgment.
  • Comfortable working as a hands-on individual contributor focused on enabling researchers and engineers.
  • Experience reviewing messy, fast-moving, or AI-generated codebases.

Compensation

  • Compensation Range: $295K - $380K USD
  • Offers equity. Base pay may vary depending on market location, skills, and experience. Total compensation may include equity, performance-related bonuses, and other components.

Benefits

  • Medical, dental, and vision insurance with employer contributions to Health Savings Accounts.
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses.
  • 401(k) retirement plan with employer match.
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents); paid medical and caregiver leave (up to 8 weeks).
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees.
  • 13+ paid company holidays and multiple coordinated company office closures.
  • Mental health and wellness support; employer-paid basic life and disability coverage.
  • Annual learning and development stipend.
  • Daily meals in offices and meal delivery credits as eligible.
  • Relocation support for eligible employees.

Other Notes

  • Background checks will be administered in accordance with applicable law.
  • OpenAI is an equal opportunity employer and committed to reasonable accommodations for applicants with disabilities.