Staff AI Engineer - Grafana Ops, AI/ML

USD 175,000-210,000 per year
SENIOR
✅ Remote

Required Skills & Competences

Docker @ 4, Grafana @ 4, Kubernetes @ 4, DevOps @ 4, Terraform @ 4, GCP @ 4, Leadership @ 4, AWS @ 4, Azure @ 4, Communication @ 4, Experimentation @ 4, LLM @ 4, Compliance @ 4, Observability @ 4, AI @ 4, GenAI @ 4, Prompt Engineering @ 4

Details

Grafana Labs is a remote-first, open-source powerhouse. There are more than 20M users of Grafana, the open source visualization tool, around the globe, monitoring everything from beehives to climate change in the Alps. The instantly recognizable dashboards have been spotted everywhere from a NASA launch and Minecraft HQ to Wimbledon and the Tour de France. Grafana Labs also helps more than 3,000 companies -- including Bloomberg, JPMorgan Chase, and eBay -- manage their observability strategies with the Grafana LGTM Stack, which can be run fully managed with Grafana Cloud or self-managed with the Grafana Enterprise Stack, both featuring scalable metrics (Grafana Mimir), logs (Grafana Loki), and traces (Grafana Tempo).

We’re scaling fast and staying true to what makes us different: an open-source legacy, a global collaborative culture, and a passion for meaningful work. Our team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything we do.

This is a remote opportunity. At this time, we are only considering applicants based in US time zones.

Responsibilities

  • Build and deliver AI solutions: take ownership of developing high-performance AI features to help users detect, triage, and resolve incidents using observability data and tools.
  • Rapid experimentation and iteration: implement a highly iterative process where you quickly prototype, test, and validate with real users, including shipping and evolving LLM- or agent-powered workflows for incident lifecycle management and automated analysis tasks.
  • Collaborate cross-functionally: work with data analysts, product managers, and designers to shape AI-driven product features, including integration of agentic components with internal tools, alerting systems, runbooks, and developer workflows.
  • Utilize AI tools effectively: use AI and automation tools to enhance both product functionality and your own development workflows.
  • Effective communication: contribute across teams in a highly dynamic and collaborative environment.
  • Ownership and impact: take full ownership of AI solutions, ensuring they are scalable, maintainable, and aligned with real user workflows.

Requirements / What Makes You a Great Fit

  • Strong engineering skills: solid experience building production software systems (backend and/or full stack). Ability to tackle complex engineering problems with minimal supervision.
  • AI experience with a practical mindset: familiarity with AI technologies and frameworks, and focus on delivering high-quality solutions that work in the real world.
  • Experience with LLMs, prompt engineering, and building applications powered by GenAI.
  • Proven track record of delivering software that reached production and is in active use.
  • Exposure to working in cloud-native environments (e.g., AWS, GCP, Azure).
  • Experience using observability tools to understand and troubleshoot system behavior.
  • Quick iteration and experimentation: comfortable releasing prototypes, collecting feedback, and iterating.
  • Proven initiative and a collaborative attitude: ownership, ability to handle ambiguity, and effective communication with peers, product managers, and designers.

Bonus Points For

  • Experience building or working with agent frameworks or multi‑agent workflows.
  • Experience with infrastructure / DevOps related tooling: Kubernetes, Docker, Terraform or similar for deployments.
  • Familiarity with model fine-tuning techniques.
  • Experience building observability tooling.

Compensation & Rewards

In the United States, the base compensation range for this role is USD 174,986 - USD 209,983. Actual compensation may vary based on level, experience, and skill set as assessed in the interview process. Benefits include equity, a bonus (if applicable), and other benefits listed by Grafana Labs. All roles include Restricted Stock Units (RSUs).

Why You’ll Thrive at Grafana Labs

  • 100% Remote, Global Culture: as a remote-only company, Grafana brings together talent from around the world.
  • Scaling Organization: tackle meaningful work in a high-growth environment.
  • Transparent Communication, Innovation-Driven culture, Open Source Roots, Empowered Teams, Career Growth Pathways, and Approachable Leadership.
  • In-person onboarding to help new hires integrate from day one.
  • Global annual leave policy of 30 days per annum (subject to local legislation where applicable).

Equal Opportunity

Grafana Labs is an equal opportunity employer and will recruit, train, compensate and promote regardless of race, religion, color, national origin, gender, disability, age, veteran status, and other characteristics. The company may utilize AI tools in its recruitment process to assist in matching information provided in CVs to job postings; recruitment teams will continue to review CVs manually.