Staff / Senior Software Engineer, Cloud Inference

USD 300,000-485,000 per year
SENIOR
✅ Hybrid
✅ Visa Sponsorship

Used Tools & Technologies

Machine Learning, GPU

Required Skills & Competences

Kubernetes @ 4, Python @ 6, GCP @ 4, CI/CD @ 4, Distributed Systems @ 7, AWS @ 4, Azure @ 4, Networking @ 4, Rust @ 6, API @ 4, LLM @ 4, Observability @ 4, AI @ 4

Details

Anthropic’s Cloud Inference team scales and optimizes Claude to serve developers and enterprise customers across AWS, GCP, Azure, and future cloud service providers. The team owns end-to-end serving for Claude on each cloud platform, including API integration, request routing, inference execution, capacity management, and operations. Engineers on this team make infrastructure decisions that improve scale, cost-effectiveness, and reliability for large-scale LLM inference.

Responsibilities

  • Design and build infrastructure to serve Claude across multiple CSPs, accounting for differences in compute hardware, networking, APIs, and operational models
  • Collaborate with CSP partner engineering teams to resolve operational issues, influence provider roadmaps, and stand up end-to-end serving on new cloud platforms
  • Design and evolve CI/CD automation systems, including validation and deployment pipelines, to reliably ship new model versions across cloud platforms
  • Design interfaces and tooling abstractions across CSPs to enable cost-effective inference management and reduce per-platform complexity
  • Contribute to capacity planning and autoscaling strategies that dynamically match supply with demand across validation and production workloads
  • Optimize inference cost and performance across providers by designing workload placement and routing systems that choose cost-effective accelerators and regions
  • Contribute to inference features that must work consistently across platforms
  • Analyze observability data across providers to identify performance bottlenecks, cost anomalies, and regressions, and drive remediation based on production workloads

Requirements

  • Significant software engineering experience with a strong background in high-performance, large-scale distributed systems serving millions of users
  • Experience building or operating services on at least one major cloud platform (AWS, GCP, or Azure); exposure to Kubernetes or other container orchestration systems, or to Infrastructure as Code
  • Strong interest in inference and familiarity with LLM inference optimization, batching, caching, and serving strategies
  • Experience or interest in capacity management, cost optimization, and resource planning at scale across heterogeneous environments
  • Ability to collaborate cross-functionally with internal teams and external CSP partners
  • Highly autonomous, fast learner, and able to take ownership of problems end-to-end
  • Education: at least a Bachelor's degree in a related field or equivalent experience

Strongly Preferred / Nice-to-Have

  • Direct experience working with CSP partner teams to scale infrastructure across multiple platforms
  • Background building platform-agnostic tooling or abstraction layers across cloud providers
  • Hands-on experience with ML infrastructure (GPUs, TPUs, Trainium, or other AI accelerators)
  • Experience designing and building CI/CD systems that automate deployment and validation across cloud environments
  • Solid understanding of multi-region deployments, geographic routing, and global traffic management
  • Proficiency in Python or Rust

Logistics

  • Location: San Francisco, CA and Seattle, WA
  • Location-based hybrid policy: staff are expected to be in an office at least 25% of the time
  • Visa sponsorship: Anthropic states that it sponsors visas and retains an immigration lawyer to assist where possible
  • Education requirement: Bachelor's degree or equivalent experience

Compensation & Benefits

  • Annual salary range: $300,000 - $485,000 USD
  • Anthropic offers competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and office space for collaboration