Staff Software Engineer - Grafana Cloud k6 | Spain | Remote
at Grafana Labs
📍 Spain
EUR 94,000-112,800 per year
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose;
- 3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology;
- 7-9: expert. You can teach others and you know the pitfalls and tricks;
- 10: exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Docker @ 4
Go @ 7
Grafana @ 4
Kubernetes @ 4
DevOps @ 4
Python @ 7
Distributed Systems @ 4
Leadership @ 4
AWS @ 4
Communication @ 4
JavaScript @ 4
SRE @ 4
Prioritization @ 4
Reporting @ 4
Observability @ 4
AI @ 4
Change Management @ 4
Details
Grafana Labs is a remote-first, open-source company that builds observability tooling used worldwide. This role sits on the Grafana Cloud k6 squad (k6, Grafana Cloud k6, Grafana Cloud Synthetics), focused on performance testing at scale and ingesting large volumes of test data for analysis and correlation.
Responsibilities
- Build and scale a strong culture of operational excellence by defining standards and coaching teams to own reliability and availability.
- Drive mature DevOps/SRE practices, including incident response and PIRs, on-call readiness, runbooks, alerting, observability, and release/change management.
- Establish reliability frameworks such as SLIs/SLOs and error budgets, and use them to guide prioritization and engineering trade-offs.
- Provide visibility into system health through operational metrics and reliability reporting.
- Guide teams in the design, development, evolution, and operation of large-scale, distributed cloud systems.
- Influence product and system direction through design reviews, architectural discussions, and cross-team collaboration.
- Share knowledge through high-quality documentation and technical communication—internally and, where appropriate, externally.
- As the reliability foundation matures, expand into broader application and product development leadership, contributing architectural and technical depth beyond operations.
- Use modern AI coding assistants as part of the workflow (company-funded usage budget), with emphasis on pragmatic AI-assisted development paired with code review and quality standards.
Requirements
- Strong experience with DevOps/SRE practices and operating/evolving production systems at scale.
- Strong programming background in a modern language (Python and Go are the primary languages for this role, but prior experience with them is not required).
- Experience designing, building, and operating large-scale distributed systems.
- Strong understanding of reliability engineering concepts (incident management, observability, failure modes).
- Experience with test automation, including performance and functional testing.
- Ability to influence engineering practices through clear technical communication, reviews, and collaboration.
- Strong interpersonal skills and ability to work effectively across teams.
- Familiarity with modern software engineering processes and delivery practices.
- Self-driven and comfortable operating with a high degree of autonomy and ambiguity.
Bonus
- Experience with containerized and cloud-native systems (Docker, Kubernetes, AWS).
- Familiarity with observability tooling and platforms (for example, the Grafana stack).
- Experience working with Python, Go, JavaScript and/or Jsonnet.
- Experience building or operating event-driven or asynchronous systems.
- Experience defining or applying SLIs/SLOs, error budgets, or reliability metrics.
- Interest in, or experience with, building testing frameworks or developer tooling.
Compensation
- In Spain, the base compensation range for this role is EUR 94,025 - EUR 112,830. Actual compensation may vary based on level, experience, and skillset as assessed in the interview process.
- Compensation ranges are country-specific; candidates applying from other locations will discuss market-specific pay ranges with a recruiter.
Benefits & Other Notes
- Benefits include equity, bonus (if applicable), and other benefits listed on the company careers page.
- 100% remote, global culture; in-person onboarding is provided.
- Global annual leave policy of 30 days per year (3 of which are reserved for Grafana Shutdown Days), subject to local legislation.
- Grafana Labs is an equal opportunity employer and may utilize AI tools in its recruitment process.