Used Tools & Technologies
GenAI
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose;
- 3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology;
- 7-9: expert. You can teach others and you know all the pitfalls and tricks;
- 10: exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Docker @ 4
Grafana @ 4
Kubernetes @ 4
DevOps @ 4
Terraform @ 4
GCP @ 4
AWS @ 4
Azure @ 4
Communication @ 4
Experimentation @ 4
LLM @ 4
Observability @ 4
Generative AI @ 4
AI @ 4
Prompt Engineering @ 4
Details
Grafana Labs, the company behind the open observability cloud, is founded on the principles of open source, open standards, open ecosystems, and open culture. Grafana Cloud, our fully managed observability platform, is flexible and built for scale. With Grafana Cloud's AI capabilities, organizations can see, understand, and act on all their disparate data to move at the speed of their ambitions. We are a 100% remote company with team members across many countries.
Location / Work Arrangement
This is a remote opportunity. At this time, Grafana Labs is considering only applicants based in Canadian time zones.
Responsibilities
- Build and deliver AI solutions: take ownership of developing high-performance AI features to help users detect, triage, and resolve incidents using observability data and tools.
- Rapid experimentation and iteration: prototype, test, and validate with real users, then ship and evolve LLM- or agent-powered workflows for incident lifecycle management and automated analysis tasks.
- Collaborate cross-functionally with data analysts, product managers, and designers to shape AI-driven product features and integrate agentic components with internal tools, alerting systems, runbooks, and developer workflows.
- Utilize AI and automation tools to enhance product functionality and development workflows.
- Communicate effectively across teams in a dynamic, collaborative environment.
- Take full ownership of AI solutions, ensuring they are scalable, maintainable, and aligned with user workflows.
Requirements
- Experience with LLMs, prompt engineering, and building applications powered by Generative AI.
- Proven track record of delivering software that made it into production and is actively used by users (backend and/or full-stack production systems experience).
- Exposure to cloud-native environments (e.g., AWS, GCP, Azure).
- Experience using observability tools to understand and troubleshoot system behavior.
- Strong engineering skills, ability to iterate quickly, and comfort working with ambiguity while defining scope and driving projects.
Bonus Points For
- Experience building or working with agent frameworks or multi-agent workflows.
- Experience with infrastructure/DevOps tooling such as Kubernetes, Docker, Terraform (or similar) for deployments.
- Familiarity with model fine-tuning techniques.
- Experience building observability tooling.
Compensation & Rewards
- In Canada, the base compensation range for this role is CAD 186,368 - CAD 230,000. Actual compensation may vary based on level, experience, and skillset.
- All roles include Restricted Stock Units (RSUs).
Benefits & Culture
- 100% remote company with a global culture.
- In-person onboarding to help new hires learn and connect with the team.
- Global annual leave policy of 30 days per annum (with 3 days reserved for Grafana Shutdown Days), subject to local legislation.
- Access to modern AI coding assistants with a company-funded usage budget, and access to frontier models (examples listed in the posting).
- Emphasis on autonomy, collaboration, transparent communication, open-source roots, and career growth pathways.
Equal Opportunity
Grafana Labs is an equal opportunities employer and welcomes applications from all backgrounds. The company may use AI tools in its recruitment process to help match CVs to job postings, with manual review by the recruitment team.