Staff Software Engineer - Databases SRE

📍 Germany
📍 Spain
📍 United Kingdom
📍 Sweden
EUR 109,700-131,700 per year
SENIOR
✅ Remote

Required Skills & Competences

Go @ 4, Grafana @ 4, Kubernetes @ 7, Linux @ 4, Terraform @ 3, Python @ 4, GCP @ 4, Java @ 4, Hiring @ 4, Leadership @ 7, AWS @ 4, Azure @ 4, Helm @ 3, Mentoring @ 7, Networking @ 4, SRE @ 4, Technical Leadership @ 7, Design Patterns @ 4, Observability @ 4, AI @ 4

Details

Grafana Labs is hiring a Staff Software Engineer (SRE) to support Grafana Cloud’s database products (Mimir, Loki, Tempo, Pyroscope) delivered as SaaS from AWS, GCP, and Azure across regions. This role is embedded within the Mimir, Loki, and Tempo squads and focuses on increasing reliability for high-SLA customers. This is a remote opportunity targeting candidates located in the United Kingdom, Sweden, Spain, or Germany.

Responsibilities

  • Partner closely with product engineering squads (embedded model)
  • Own production reliability for high-SLA and complex customer environments
  • Design and implement automation to scale reliability practices
  • Ensure SLO targets are met for customers, and define and evolve per-tenant SLOs and reliability models
  • Proactively reduce SLO burn to prevent repeat incidents
  • Serve as a primary escalation point and participate in on-call for relevant incidents
  • Lead customer-impacting incident response and post-incident reviews (PIRs)
  • Contribute to design docs and code reviews
  • Influence feature design to ensure production scalability and operability
  • Build automation to eliminate toil and improve alert quality, reducing noisy escalations

Requirements

  • 8+ years of engineering experience, including 4+ years in SRE/CRE/production engineering (strong preference for formal customer reliability engineering experience)
  • Strong Kubernetes experience in AWS, GCP, or Azure
  • Familiarity with infrastructure-as-code tooling such as Helm, Terraform, or Jsonnet
  • Experience operating multi-tenant systems in production
  • Strong experience designing and implementing SLOs
  • Experience with one or more programming languages (for example, Go, Python, or Java)
  • Knowledge of Linux operating system internals, networking, cloud storage, and scaling
  • Excellent problem-solving and troubleshooting skills
  • Experience with calm, blame-free incident response, follow-up actions, and writing high-quality post-incident reviews (PIRs)
  • Ability to reason about performance, scaling, and failure modes
  • Strong technical leadership: leading projects, mentoring engineers, and serving as a force-multiplier
  • Ability to partner deeply with product engineering teams and work autonomously

Your day-to-day

  • Regular 1:1s with your manager and colleagues
  • Review and create SLOs; investigate and reduce error-budget burn through monitoring, automation, self-healing, auto-scaling, and similar measures
  • Improve observability for customers within their environments
  • Design and implement solutions to ensure reliability and scalability
  • Develop fault-tolerant design patterns and consider reliability during the service lifecycle
  • Collaborate with Engineering Leaders on product strategy, roadmaps, and technical designs
  • Participate in PR review and design doc collaboration
  • Teach SRE best practices and participate in incident response, including bridge calls when necessary

Compensation & Benefits

  • Germany base compensation range: EUR 109,709 - EUR 131,651 (actual compensation may vary by level, experience, and assessed skillset)
  • Benefits include equity, bonus (if applicable), and other benefits detailed on Grafana Labs careers pages
  • 100% remote company with in-person onboarding
  • Global annual leave policy of 30 days per annum (3 days reserved for Grafana Shutdown Days). Local legislation will be complied with where applicable.

Why You’ll Thrive at Grafana Labs

  • 100% remote, global culture
  • Scaling organization with meaningful work and transparency
  • Open-source roots and empowered teams
  • Career growth pathways and approachable leadership
  • Culture valuing curiosity, transparency, bias toward action, and kindness

Other notes

  • This role is open to candidates located in the United Kingdom, Sweden, Spain, or Germany.
  • Compensation ranges are country-specific; applicants in other countries will receive market-specific pay range information during the hiring process.
  • Grafana Labs may utilize AI tools in recruitment; manual review of CVs is still performed.