Used Tools & Technologies
Machine Learning GPU
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Kubernetes @ 4
Python @ 6
GCP @ 4
CI/CD @ 4
Distributed Systems @ 7
AWS @ 4
Azure @ 4
Networking @ 4
Rust @ 6
API @ 4
LLM @ 4
Observability @ 4
AI @ 4
Details
Anthropic’s Cloud Inference team scales and optimizes Claude to serve developers and enterprise customers across AWS, GCP, Azure, and future cloud service providers. The team owns end-to-end serving for Claude on each cloud platform, including API integration, request routing, inference execution, capacity management, and operations. Engineers on this team make infrastructure decisions that improve scale, cost-effectiveness, and reliability for large-scale LLM inference.
Responsibilities
- Design and build infrastructure to serve Claude across multiple CSPs, accounting for differences in compute hardware, networking, APIs, and operational models
- Collaborate with CSP partner engineering teams to resolve operational issues, influence provider roadmaps, and stand up end-to-end serving on new cloud platforms
- Design and evolve CI/CD automation systems, including validation and deployment pipelines, to reliably ship new model versions across cloud platforms
- Design interfaces and tooling abstractions across CSPs to enable cost-effective inference management and reduce per-platform complexity
- Contribute to capacity planning and autoscaling strategies that dynamically match supply with demand across validation and production workloads
- Optimize inference cost and performance across providers by designing workload placement and routing systems that choose cost-effective accelerators and regions
- Contribute to inference features that must work consistently across platforms
- Analyze observability data across providers to identify performance bottlenecks, cost anomalies, and regressions, and drive remediation based on production workloads
Requirements
- Significant software engineering experience with a strong background in high-performance, large-scale distributed systems serving millions of users
- Experience building or operating services on at least one major cloud platform (AWS, GCP, or Azure); exposure to Kubernetes or other container orchestration systems, or to Infrastructure as Code
- Strong interest in inference and familiarity with LLM inference optimization, batching, caching, and serving strategies
- Experience or interest in capacity management, cost optimization, and resource planning at scale across heterogeneous environments
- Ability to collaborate cross-functionally with internal teams and external CSP partners
- Highly autonomous, fast learner, and able to take ownership of problems end-to-end
- Education: at least a Bachelor's degree in a related field or equivalent experience
Strongly Preferred / Nice-to-Have
- Direct experience working with CSP partner teams to scale infrastructure across multiple platforms
- Background building platform-agnostic tooling or abstraction layers across cloud providers
- Hands-on experience with ML infrastructure (GPUs, TPUs, Trainium, or other AI accelerators)
- Experience designing and building CI/CD systems that automate deployment and validation across cloud environments
- Solid understanding of multi-region deployments, geographic routing, and global traffic management
- Proficiency in Python or Rust
Logistics
- Location: San Francisco, CA and Seattle, WA
- Location-based hybrid policy: we expect staff to be in an office at least 25% of the time
- Visa sponsorship: Anthropic states that it sponsors visas and retains an immigration lawyer to assist where possible
- Education requirement: Bachelor's degree or equivalent experience
Compensation & Benefits
- Annual salary range: $300,000 - $485,000 USD
- Anthropic offers competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and office space for collaboration.