Performance Engineer

USD 315,000-560,000 per year
MIDDLE
✅ Hybrid

Used Tools & Technologies

Not specified

Required Skills & Competences

  • Algorithms @ 3
  • Distributed Systems @ 3
  • Machine Learning @ 3
  • Debugging @ 3
  • GPU @ 3

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. The team builds large-scale, beneficial AI systems and includes researchers, engineers, policy experts, and business leaders. As a Performance Engineer you will identify and solve novel systems problems that arise from running machine learning algorithms at scale, and develop systems to optimize throughput and robustness of large distributed ML systems.

Responsibilities

  • Identify and solve large-scale systems problems related to running ML workloads.
  • Implement low-latency, high-throughput sampling for large language models.
  • Implement GPU kernels to adapt models for low-precision inference.
  • Write custom load-balancing algorithms to optimize serving efficiency.
  • Build quantitative models of system performance (see the sketch after this list).
  • Design and implement fault-tolerant distributed systems operating across complex network topologies.
  • Debug kernel-level network latency spikes in containerized environments.
  • Pair program and collaborate frequently with other engineers and researchers.
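
One of the responsibilities above is building quantitative models of system performance. As a hedged illustration of what such a model can look like, the sketch below estimates per-step transformer decode latency with a first-order roofline argument; the parameter count, batch size, and accelerator peak FLOP/s and bandwidth are illustrative assumptions, not figures for this role or for Anthropic's systems.

```python
# Minimal first-order model: is single-stream transformer decode
# memory-bandwidth-bound or compute-bound on a given accelerator?
# All numbers below are illustrative assumptions, not Anthropic figures.

PARAMS = 70e9          # assumed model size (parameters)
BYTES_PER_PARAM = 2    # assumed fp16/bf16 weights
PEAK_FLOPS = 9.9e14    # assumed accelerator peak compute (FLOP/s)
PEAK_BW = 3.35e12      # assumed HBM bandwidth (bytes/s)
BATCH = 8              # decoded sequences per step

def decode_step_time(batch: int) -> float:
    """Estimate seconds per decode step, assuming weight reads dominate memory traffic."""
    flops = 2 * PARAMS * batch              # ~2 FLOPs per parameter per generated token
    bytes_moved = PARAMS * BYTES_PER_PARAM  # weights streamed from HBM once per step
    compute_time = flops / PEAK_FLOPS
    memory_time = bytes_moved / PEAK_BW
    return max(compute_time, memory_time)   # roofline: bound by the slower resource

if __name__ == "__main__":
    t = decode_step_time(BATCH)
    print(f"~{t * 1e3:.2f} ms/step, ~{BATCH / t:.0f} tokens/s aggregate")
```

A model this coarse is mainly useful for deciding which resource to attack first (bandwidth, compute, or interconnect) before detailed profiling.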

Requirements

  • Significant software engineering or machine learning experience, particularly with supercomputing or large-scale ML systems.
  • Experience or strong interest in GPU/accelerator programming, ML framework internals, and OS internals.
  • Familiarity with language modeling (transformers) and large-scale ML serving/engineering concerns.
  • Comfortable building quantitative performance models and debugging low-level system issues, including kernel-level network latency in containers (a small measurement sketch follows this list).
  • We require at least a Bachelor's degree in a related field or equivalent experience.
  • Results-oriented, flexible, collaborative, and interested in learning more about ML research; pair programming is emphasized.
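
As a hedged sketch of the latency-debugging requirement above, the snippet below samples TCP connect round-trips to a service and reports median and tail percentiles; distributions like this are one starting point for deciding whether a spike originates in the application, the host network stack, or the fabric. The endpoint and sample count are placeholders, not real infrastructure.

```python
# Sample TCP connect round-trip times and report p50/p99.
# Target host/port below are hypothetical placeholders.
import socket
import statistics
import time

TARGET = ("example.internal", 443)  # placeholder endpoint
SAMPLES = 200

def connect_rtt(addr, timeout=1.0):
    """Return one TCP connect round-trip time in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection(addr, timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1e3

def main():
    rtts = []
    for _ in range(SAMPLES):
        try:
            rtts.append(connect_rtt(TARGET))
        except OSError:
            continue  # a real tool would count failures separately
        time.sleep(0.05)
    if not rtts:
        print("no successful samples")
        return
    rtts.sort()
    p50 = statistics.median(rtts)
    p99 = rtts[int(0.99 * (len(rtts) - 1))]
    print(f"n={len(rtts)}  p50={p50:.2f} ms  p99={p99:.2f} ms")

if __name__ == "__main__":
    main()
```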

Representative Projects / Strong Candidate Experience

  • High-performance, large-scale ML systems work
  • GPU/Accelerator programming and GPU kernel development
  • ML framework internals and low-precision inference adaptation (see the sketch after this list)
  • Designing and operating fault-tolerant distributed systems
  • Load-balancing algorithms and serving efficiency optimization
  • Debugging OS/kernel-level performance issues in containerized environments
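
To give one concrete, hedged example of the low-precision inference item above, the sketch below applies generic symmetric per-channel int8 weight quantization in NumPy. It is a textbook technique shown for illustration only; it does not describe Anthropic's inference stack, and every name in it is made up.

```python
# Generic symmetric per-output-channel int8 weight quantization (illustrative only).
import numpy as np

def quantize_int8(w: np.ndarray):
    """Quantize a float weight matrix to int8 with one scale per output channel."""
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale)   # guard against all-zero rows
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Map int8 weights back to float32 for comparison against the originals."""
    return q.astype(np.float32) * scale

if __name__ == "__main__":
    w = np.random.randn(4, 16).astype(np.float32)
    q, s = quantize_int8(w)
    err = np.abs(w - dequantize(q, s)).max()
    print(f"max abs quantization error: {err:.4f}")
```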

Logistics & Other Details

  • Locations: San Francisco, CA; New York City, NY; Seattle, WA (United States).
  • Location-based hybrid policy: staff are expected to be in one of the offices at least 25% of the time; some roles may require more on-site time.
  • Visa sponsorship: Anthropic does sponsor visas and retains immigration counsel; sponsorship availability may vary by role/candidate.
  • Education: Minimum Bachelor's degree or equivalent experience required.
  • Deadline to apply: None (applications reviewed on a rolling basis).
  • Diversity note: candidates are encouraged to apply even if they do not meet every qualification; Anthropic values diverse perspectives and considers the societal impacts of AI.

Benefits / Culture

  • Competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and collaborative office space.
  • Emphasis on high-impact, large-scale AI research and frequent research discussions.
  • Guidance on how candidates may use AI during the application process is provided.