Staff + Senior Software Engineer, Inference

at Anthropic

📍 New York City, United States
📍 San Francisco, United States
📍 Seattle, United States

USD 320,000-485,000 per year

SENIOR

✅ Hybrid

✅ Visa Sponsorship

Used Tools & Technologies

Not specified

Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level. The levels mean:

1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose.

3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology.

7-9: expert. You can teach others and you know the pitfalls and tricks.

10: exceptional knowledge. Comprehensive understanding of and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.

Kubernetes @ 4, Python @ 6, GCP @ 4, Algorithms @ 4, Distributed Systems @ 4, Machine Learning @ 4, AWS @ 4, Azure @ 4, Rust @ 6, Slack @ 4, LLM @ 3, Observability @ 4, AI @ 4

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. The Inference team builds and maintains the systems that serve Claude to millions of users worldwide, delivering compute-agnostic inference deployments across diverse AI accelerators and cloud platforms. The team focuses on maximizing compute efficiency and enabling research by providing high-performance inference infrastructure.

Responsibilities

Design, build, and maintain distributed systems that serve Claude to millions of users worldwide
Develop intelligent request routing, load balancing, and traffic management systems across thousands of accelerators
Maximize compute efficiency across the fleet by autoscaling and orchestrating production, research, and experimental workloads
Build and operate production-grade deployment pipelines for releasing new models to users
Provide high-performance inference infrastructure that enables researchers to develop next-generation models
Integrate new AI accelerator platforms and support inference for new model architectures
Use observability data to tune and improve performance based on real-world production workloads
Manage multi-region deployments and geographic routing for global customers

Requirements

Minimum qualifications:

Significant software engineering experience, particularly with distributed systems
Results-oriented, with a bias towards flexibility and impact
Willingness to pick up slack, even if it goes outside your job description
Enjoy pair programming
Desire to learn more about machine learning systems and infrastructure
Thrive in environments where technical excellence directly drives both business results and research breakthroughs
Care about the societal impacts of your work

Preferred qualifications:

Experience with high-performance, large-scale distributed systems
Experience implementing and deploying machine learning systems at scale
Experience with load balancing, request routing, or traffic management systems
Familiarity with LLM inference optimization, batching, and caching strategies
Experience with Kubernetes and cloud infrastructure (AWS, GCP, Azure)
Proficiency in Python or Rust

Representative projects (examples of work you might do):

Designing intelligent routing algorithms that optimize request distribution across thousands of accelerators
Autoscaling the compute fleet to match supply with demand across production, research, and experimental workloads
Building production-grade deployment pipelines for releasing new models to millions of users
Integrating new AI accelerator platforms to maintain a hardware-agnostic advantage
Contributing to inference features (e.g., structured sampling, prompt caching)
Supporting inference for new model architectures
Analyzing observability data to tune performance based on real-world production workloads

Logistics & Additional Information

Minimum education: Bachelor’s degree or equivalent combination of education, training, and/or experience
Location-based hybrid policy: staff are expected to be in one of Anthropic’s offices at least 25% of the time
Visa sponsorship: Anthropic states that it sponsors visas and retains an immigration lawyer to assist, though not every role or candidate can be sponsored successfully
Applications reviewed on a rolling basis

Benefits

Competitive compensation and benefits
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours
Office space for collaboration