Staff + Sr. Software Engineer, Cloud Inference

at Anthropic

📍 San Francisco, United States

USD 320,000-485,000 per year

SENIOR

✅ Hybrid

✅ Visa Sponsorship

Used Tools & Technologies

IaC, Machine Learning

Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level. The levels mean:

1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose;

3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology;

7-9: expert. You can teach others and you know the pitfalls and tricks;

10: exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.

Security @ 4, Kubernetes @ 4, Python @ 6, GCP @ 4, CI/CD @ 4, Distributed Systems @ 7, AWS @ 4, Azure @ 4, Communication @ 4, Networking @ 4, Rust @ 6, API @ 4, LLM @ 4, Observability @ 4, AI @ 4

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. The Cloud Inference team scales and optimizes Claude to serve developers and enterprise customers across AWS, GCP, Azure, and future cloud service providers (CSPs). The team owns the end-to-end product of Claude on each cloud platform, including API integration, request routing, inference execution, capacity management, and operations.

This role focuses on building high-performance, large-scale backend services and infrastructure to serve LLMs across heterogeneous cloud providers while optimizing for reliability, cost, and performance.

Responsibilities

Design, build, and own backend services and infrastructure that serve Claude across multiple CSPs, accounting for differences in compute hardware, networking, APIs, and operational models
Work cross-functionally with internal inference, product API, systems, and security teams and with CSP partners to stand up the full serving stack on new cloud platforms and resolve operational issues
Build and evolve CI/CD automation systems, including validation and deployment pipelines that reliably ship new model versions at scale
Design interfaces and tooling abstractions across CSPs to enable cost-effective inference management and reduce per-platform complexity
Contribute to capacity planning, autoscaling, and workload routing strategies to match supply with demand and route requests to cost-effective accelerators and regions
Analyze observability data across providers to identify performance bottlenecks, cost anomalies, and regressions, and drive remediation based on production workloads

Minimum qualifications

Significant software engineering experience with a strong background in high-performance, large-scale distributed systems serving millions of users
Experience building or operating services on at least one major cloud platform (AWS, GCP, or Azure) with exposure to Kubernetes, Infrastructure as Code, or container orchestration
Curious about LLM serving (prior inference/ML experience not required)
Comfortable working cross-functionally with internal teams and external partners
Experience aligning goals and delivering impact with external partners
Fast learner who can quickly ramp up on new technologies, hardware platforms, and provider ecosystems
Highly autonomous and able to take end-to-end ownership, including work outside a strict job description

Preferred qualifications

Direct experience working with CSPs to scale infrastructure or products across multiple platforms and navigating differences in networking, security, privacy, billing, and managed services
Hands-on experience with capacity management, cost optimization, or resource planning at scale across heterogeneous environments
Solid understanding of multi-region deployments, geographic routing, and global traffic management
Proficiency in Python or Rust

Compensation

Annual salary range: $320,000 - $485,000 USD

Logistics

Minimum education: Bachelor’s degree or equivalent combination of education, training, and/or experience
Location-based hybrid policy: staff are expected to be in one of Anthropic's offices at least 25% of the time; some roles may require more office presence

Visa sponsorship

Anthropic states that it sponsors visas and retains an immigration lawyer to assist, though sponsorship is not guaranteed for every role or candidate

How we're different

Anthropic emphasizes large-scale, collaborative AI research with a focus on steerable and trustworthy systems. The team values communication and works on a few large-scale research efforts with high impact.