Used Tools & Technologies
Not specified
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — expert. You can teach others and know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Kubernetes @ 4
Python @ 6
GCP @ 4
Algorithms @ 4
Distributed Systems @ 4
Machine Learning @ 4
AWS @ 4
Azure @ 4
Rust @ 6
Slack @ 4
LLM @ 3
Observability @ 4
AI @ 4
Details
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. The Inference team builds and maintains the systems that serve Claude to millions of users worldwide, delivering compute-agnostic inference deployments across diverse AI accelerators and cloud platforms. The team focuses on maximizing compute efficiency and enabling research by providing high-performance inference infrastructure.
Responsibilities
- Design, build, and maintain distributed systems that serve Claude to millions of users worldwide
- Develop intelligent request routing, load balancing, and traffic management systems across thousands of accelerators
- Maximize compute efficiency across the fleet by autoscaling and orchestrating production, research, and experimental workloads
- Build and operate production-grade deployment pipelines for releasing new models to users
- Provide high-performance inference infrastructure that enables researchers to develop next-generation models
- Integrate new AI accelerator platforms and support inference for new model architectures
- Use observability data to tune and improve performance based on real-world production workloads
- Manage multi-region deployments and geographic routing for global customers
Requirements
Minimum qualifications:
- Significant software engineering experience, particularly with distributed systems
- Results-oriented, with a bias towards flexibility and impact
- Willingness to pick up slack, even if it goes outside your job description
- Enjoy pair programming
- Desire to learn more about machine learning systems and infrastructure
- Thrive in environments where technical excellence directly drives both business results and research breakthroughs
- Care about the societal impacts of your work
Preferred qualifications:
- Experience with high-performance, large-scale distributed systems
- Experience implementing and deploying machine learning systems at scale
- Experience with load balancing, request routing, or traffic management systems
- Familiarity with LLM inference optimization, batching, and caching strategies
- Experience with Kubernetes and cloud infrastructure (AWS, GCP, Azure)
- Proficiency in Python or Rust
Representative projects (examples of work you might do):
- Designing intelligent routing algorithms that optimize request distribution across thousands of accelerators
- Autoscaling the compute fleet to match supply with demand across production, research, and experimental workloads
- Building production-grade deployment pipelines for releasing new models to millions of users
- Integrating new AI accelerator platforms to maintain a hardware-agnostic advantage
- Contributing to inference features (e.g., structured sampling, prompt caching)
- Supporting inference for new model architectures
- Analyzing observability data to tune performance based on real-world production workloads
Logistics & Additional Information
- Minimum education: Bachelor’s degree or equivalent combination of education, training, and/or experience
- Location-based hybrid policy: staff are expected to be in one of Anthropic’s offices at least 25% of the time
- Visa sponsorship: Anthropic states that it sponsors visas and retains an immigration lawyer to assist, though not every role or candidate can be successfully sponsored
- Applications reviewed on a rolling basis
Benefits
- Competitive compensation and benefits
- Optional equity donation matching
- Generous vacation and parental leave
- Flexible working hours
- Office space for collaboration