Software Engineer, Inference Deployment

USD 320,000-485,000 per year
MIDDLE
✅ Hybrid
✅ Visa Sponsorship

Tools & Technologies Used

Machine Learning

Required Skills & Competences

Kubernetes @ 5, Python @ 3, Communication @ 6, Rust @ 3, GPU @ 3, Observability @ 3, AI @ 3

Details

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. The Launch Engineering team makes inference deployment continuous and unattended — moving inference code from merge to production across GPU, TPU, and Trainium fleets while minimizing disruption to serving capacity.

Responsibilities

  • Own deployment orchestration that continuously moves validated inference builds into production across GPU, TPU, and Trainium fleets, unattended under normal conditions
  • Improve capacity-aware deployment scheduling to maximize deployment throughput against constrained accelerator budgets and variable fleet sizes
  • Extend deployment observability — dashboards and tooling that answer "what code is running in production," "where is my commit," and "what validation passed for this deploy"
  • Drive down cycle time from code merge to production with pipeline architectures that minimize serial dependencies and maximize parallelism
  • Optimize fleet rollout strategies for large-scale deployments across thousands of GPU, TPU, and Trainium chips, minimizing disruption to serving capacity
  • Evolve self-service model onboarding so that new models can be added to the continuous deployment pipeline without Launch Engineering involvement
  • Partner across the Inference organization with teams owning validation, autoscaling, and model routing to integrate deployment automation with their systems

Requirements

  • 5+ years of experience building deployment, release, or delivery infrastructure at scale
  • Strong software engineering skills with experience designing systems that manage complex state machines and multi-stage pipelines
  • Experience with deployment systems where resource constraints shape the design — e.g., fleet capacity, network bandwidth, hardware availability, or coordinated rollout windows
  • A track record of building automation that measurably improves deployment velocity and reliability
  • Proficiency with Kubernetes-based deployments, rolling update mechanics, and container orchestration
  • Comfort working across the stack — from backend services and databases to CLI tools and web UIs
  • Strong communication skills and the ability to work closely with oncall engineers, model teams, and infrastructure partners

Strong candidates may also have

  • Experience with ML inference or training infrastructure deployment, particularly across multiple accelerator types (GPU, TPU, Trainium)
  • Background in capacity planning or resource-constrained scheduling (e.g., bin-packing, fleet management, job scheduling with hardware affinity)
  • Experience with progressive delivery in systems with long validation cycles: canary/soak testing, blue-green deployments, traffic shifting, automated rollback
  • Experience at companies with large-scale release engineering challenges (mobile release trains, monorepo deployments, multi-datacenter rollouts)
  • Experience with Python and/or Rust in production systems

Salary

  • Annual Salary: $320,000 - $485,000 USD

Logistics

  • Education requirements: Bachelor's degree in a related field or equivalent experience
  • Location-based hybrid policy: staff are expected to be in one of Anthropic's offices at least 25% of the time

Visa sponsorship

  • Anthropic states that it sponsors visas and will make reasonable efforts to obtain a visa for successful candidates; it retains an immigration lawyer to assist

Benefits / Other

  • Anthropic offers competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and office space for collaboration

How we're different

  • Anthropic emphasizes large-scale, collaborative research with strong communication and cross-team work; candidates are encouraged to read Anthropic research and apply even if they don't meet every listed qualification.