Used Tools & Technologies
Not specified
Required Skills & Competences
Security @ 3, Grafana @ 3, Kubernetes @ 3, Prometheus @ 3, IaC @ 3, Terraform @ 3, Datadog @ 3, Communication @ 6, CloudFormation @ 3, Microservices @ 3, Design Patterns @ 3, Splunk @ 3, GPU @ 3
Details
Join the engineering teams that bring OpenAI’s ideas safely to the world.
The Applied Engineering team works across research, engineering, product, and design to bring OpenAI’s technology to consumers and businesses. The team focuses on learning from deployment, distributing the benefits of AI, and ensuring safe, responsible usage. Safety is prioritized over unfettered growth.
This role is based exclusively at OpenAI’s San Francisco HQ and involves working in a deeply iterative, collaborative, fast-paced environment to ensure scalability, performance, and reliability for systems serving millions of users.
Responsibilities
- Design and implement solutions to ensure the scalability of infrastructure to meet rapidly increasing demands.
- Build and maintain load, chaos, and synthetic testing software used by development teams to improve system reliability.
- Build and maintain automation tools to streamline repetitive tasks and improve system reliability.
- Build and maintain platforms for CPU/storage, GPU, and network lifecycle management to drive efficiency, accountability, and dynamic optimization of resources.
- Implement fault-tolerant and resilient design patterns to minimize service disruptions.
- Develop and maintain service level objectives (SLOs) and service level indicators (SLIs) to measure and ensure system reliability.
- Partner with researchers, engineers, product managers, and designers to bring new features and research capabilities to production.
- Participate in an on-call rotation to respond to critical incidents and ensure 24/7 system availability.
Requirements
- Bachelor’s degree in Computer Science, Information Technology, or a related field, or equivalent work experience.
- Proven experience as a software engineer focused on reliability or a similar role at a fast-paced, rapidly scaling company.
- Strong proficiency in cloud infrastructure and cloud operational practices.
- Proficiency in one or more programming languages (the posting does not specify which), plus strong problem-solving and troubleshooting skills.
- Experience with containerization technologies and container orchestration platforms such as Kubernetes.
- Knowledge of Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.
- Experience with observability tooling such as Datadog, Prometheus, Grafana, or Splunk.
- Experience with microservices architecture and service mesh technologies.
- Knowledge of security best practices in cloud environments.
- Strong communication and collaboration skills; experience working cross-functionally.
The posting also highlights desirable attributes and ways to thrive in the role:
- Track record of accelerating engineering reliability by empowering engineers with tooling and systems.
- Humble attitude, eagerness to help colleagues, and willingness to own problems end-to-end.
- Proactive in identifying bottlenecks and areas for performance improvement.
- Use of Infrastructure as Code (IaC) principles to automate provisioning and configuration management.
Benefits & Compensation
- Base salary range listed: $255,000 – $490,000 (total compensation also includes equity and potential performance-related bonuses as described in the posting).
- Medical, dental, and vision insurance with employer contributions to Health Savings Accounts.
- Pre-tax accounts: Health FSA, Dependent Care FSA, commuter accounts where applicable.
- 401(k) retirement plan with employer match.
- Paid parental leave and additional medical/caregiver leave.
- Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees.
- 13+ paid company holidays and paid office closures throughout the year, plus paid sick/safe time where required by law.
- Mental health and wellness support; employer-paid basic life and disability coverage.
- Annual learning and development stipend.
- Daily meals in offices and meal delivery credits as eligible.
- Relocation support for eligible employees.
- Additional taxable fringe benefits (charitable donation matching, wellness stipends) may be provided.
About OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. The company emphasizes safety and diverse perspectives, and is an equal opportunity employer. Background checks will be administered in accordance with applicable law. OpenAI provides reasonable accommodations to applicants with disabilities.