Sr. Software Engineer, Inference

at Anthropic

📍 London, United Kingdom

GBP 225,000-325,000 per year

SENIOR

✅ Hybrid

✅ Visa Sponsorship

Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level. The levels mean:

1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose.

3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology.

7-9: expert. You can teach others and you know the pitfalls and tricks.

10: exceptional knowledge. Comprehensive understanding of and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.

Kubernetes @ 4, Python @ 6, GCP @ 4, Distributed Systems @ 4, Machine Learning @ 4, AWS @ 4, Communication @ 7, Rust @ 6, LLM @ 4, Observability @ 4, AI @ 4

Details

Anthropic’s Inference team builds and maintains the systems that serve Claude to millions of users worldwide. The team owns the full stack from intelligent request routing to fleet-wide orchestration across diverse AI accelerators, operating large-scale, compute-agnostic inference deployments. The role focuses on maximizing compute efficiency for production workloads while enabling research by providing high-performance inference infrastructure.

Responsibilities

Design and implement distributed systems for inference at large scale, including intelligent routing and traffic management across thousands of accelerators.
Build autoscaling systems to match compute supply with demand across production and research workloads.
Develop production-grade deployment pipelines and release processes for models.
Integrate and support new AI accelerator platforms and maintain hardware-agnostic deployments.
Implement inference features such as batching, structured sampling, prompt caching, and other LLM inference optimizations.
Analyze observability data and tune performance for real-world production workloads.
Manage multi-region deployments and geographic routing for global customers.

Requirements

Significant software engineering experience, particularly with large-scale, high-performance distributed systems.
Experience implementing and deploying machine learning systems at scale.
Familiarity with load balancing, request routing, traffic management, autoscaling, batching, caching, and other inference optimization strategies.
Experience with Kubernetes and cloud infrastructure (AWS, GCP).
Proficiency in Python or Rust.
Strong results orientation, flexibility to work across different responsibilities, and good communication skills.
Education: at least a Bachelor's degree in a related field or equivalent experience.

Strong candidates may also have experience with

LLM inference optimization and productionization
Integrating new accelerator hardware and working across multiple cloud platforms
Observability and performance tuning for large-scale systems

Benefits and Perks

Competitive compensation and benefits
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours and hybrid work policy (staff in location-based roles are expected in the office ~25% of the time)
Lovely office space

Logistics

Location: London, United Kingdom (hybrid; staff expected to be in office at least ~25% of the time)
Annual salary range: £225,000-£325,000
Visa sponsorship: Anthropic states that it sponsors visas and retains immigration counsel to assist where possible.
Deadline to apply: None (applications reviewed on a rolling basis).