Engineering Manager - Site Reliability Engineer

USD 160,000-220,000 per year
SENIOR
✅ Hybrid

Used Tools & Technologies

Not specified

Required Skills & Competences

Kubernetes @ 4, DevOps @ 4, Kibana @ 3, GCP @ 4, Datadog @ 4, Leadership @ 4, AWS @ 4, Azure @ 4, Communication @ 7, Networking @ 4, SRE @ 4, Load Testing @ 4, Technical Leadership @ 4

Details

Ready to be pushed beyond what you think you’re capable of?

At Coinbase, our mission is to increase economic freedom in the world. It’s a massive, ambitious opportunity that demands the best of us every day, as we build the emerging onchain platform and the future global financial system.

We seek a candidate who is passionate about using crypto and blockchain technology to update the financial system, is eager to solve difficult problems, wants to work with high-caliber colleagues, and actively seeks feedback.

Our work culture is intense and requires in-person participation throughout the year, with multiple team and company-wide offsites.

Responsibilities

  • Promote reliability culture across Coinbase.
  • Help scale the system by 10-20x and secure service configurations and secrets by building and enhancing service configuration manager systems.
  • Reduce customer incidents by building and enhancing Safe Release (canary-based deployment system) capabilities and onboarding thousands of services, with hundreds of deployments daily.
  • Hire and retain top talent.
  • Build trust and relationships with cross-functional teams to make embedded SRE programs successful.
  • Collaborate with engineers, product managers, and leadership to develop a strategy and a detailed roadmap.
  • Actively listen to customer feedback and iterate for improvements.
  • Provide technical leadership in architectural decisions and maintain a culture of high-quality code and processes.
  • Own the team's processes and services, ensuring SLA adherence.
  • Work closely with the talent organization to recruit engineers aligned with Coinbase's culture.
  • Coach direct reports to positively impact the organization and support their growth.

Requirements

  • 10+ years of software engineering/SRE experience, including 2+ years of management experience.
  • Knowledge of SRE, DevOps, incident management, and reliability tooling such as canary deployments and load testing.
  • Experience with public cloud infrastructure (Kubernetes, load balancers, auto-scaling), networking basics, observability tools (Datadog), and troubleshooting.
  • Strong communication and interpersonal skills.
  • Ability to think critically under pressure.
  • Willingness to understand, debug, and improve any layer of the stack.

Nice to Haves

  • Experience designing and building reliable, high-throughput, low-latency systems.
  • Experience with high severity incident management and on-call support.
  • Familiarity with observability and monitoring systems such as Kibana and Datadog.
  • Experience with AWS, GCP, Azure, or other cloud environments.
  • Experience in highly regulated environments.
  • Experience writing company-facing blog posts and training materials.

Benefits

  • Medical, dental, and vision insurance for employees and dependents.
  • Group personal accidental and term life insurance.
  • Employee Stock Purchase Plan (ESPP).
  • Wellness stipend, mobile/internet reimbursement, connections stipend.
  • Learning and development allowance.
  • Employee assistance program.
  • Global traveler travel medical policy.
  • Fertility benefits.
  • Generous time off/leave policy.