TL, Research Inference

at OpenAI
USD 380,000-555,000 per year
MIDDLE
✅ On-site
✅ Relocation

Used Tools & Technologies

Not specified

Required Skills & Competences

Distributed Systems @ 3, Communication @ 3, Debugging @ 3, GPU @ 3, Observability @ 3, AI @ 3, Profiling @ 3

Details

The Foundations team studies how model behavior changes as models, data, and compute scale. The team investigates interactions between model architecture, optimization, and training data, and uses those insights to guide model design and training. This role builds systems that let advanced AI models run efficiently at scale. It sits at the intersection of model research and systems engineering, translating architectural ideas into high-performance inference systems that expose tradeoffs in performance, memory, and scalability. This is a research-enabling systems role, not a product-serving one, focused on performance, correctness, and realism.

Responsibilities

  • Design and build high-performance inference runtimes for large-scale AI models, focusing on efficiency, reliability, and scalability.
  • Own and optimize core execution paths, including model execution, memory management, batching, and scheduling.
  • Develop and improve distributed inference across multiple GPUs, including parallelism strategies, communication patterns, and runtime coordination.
  • Implement and optimize inference-critical operators and kernels informed by real-world workloads.
  • Partner closely with research teams to ensure new model architectures are supported accurately and efficiently in inference systems.
  • Diagnose and resolve performance bottlenecks through profiling, benchmarking, and low-level debugging.
  • Contribute to observability, correctness, and reliability of large-scale AI systems.

Requirements / Qualifications

  • Experience building production inference systems (beyond training or ad-hoc model runs).
  • Comfort with GPU-centric performance engineering, including GPU memory behavior and latency/throughput tradeoffs.
  • Experience with multi-GPU or distributed systems involving batching, scheduling, or runtime coordination.
  • Ability to reason end-to-end about inference pipelines, from request handling through execution and output streaming.
  • Ability to understand research ideas and implement them within real system and performance constraints.
  • Experience solving large-scale, ambiguous systems problems, and a preference for hands-on technical ownership and execution.

Benefits

  • Base pay range for the role: $380K – $555K; total compensation may include equity and performance-related bonuses.
  • Medical, dental, and vision insurance for employees and families, with employer contributions to Health Savings Accounts.
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses.
  • 401(k) retirement plan with employer match.
  • Paid parental leave and medical/caregiver leave.
  • Paid time off (flexible PTO for exempt employees; up to 15 days annually for non-exempt employees), 13+ paid company holidays, and paid sick/safe time as required by law.
  • Mental health and wellness support; employer-paid basic life and disability coverage.
  • Annual learning and development stipend; daily meals in offices and meal delivery credits as eligible.
  • Relocation support for eligible employees.
  • Additional fringe benefits such as charitable donation matching and wellness stipends may be provided.