Senior AI Engineer, GenAI & ML Evaluation Frameworks
at Grafana Labs
Canada
CAD 164,500-197,400 per year
Required Skills & Competences
Grafana, CI/CD, Communication, LLM
Details
Grafana Labs is a remote-first, open-source company powering observability with Grafana. The Grafana AI teams build AI-driven features to help users make sense of complex observability data and reduce toil. This role is a remote opportunity for applicants in Canadian time zones only.
Responsibilities
- Design and implement robust evaluation frameworks for Generative AI and LLM-based systems, including golden test sets, regression tracking, LLM-as-judge methods, and structured output verification.
- Develop tooling and automated evaluation pipelines to enable low-friction evaluation of model outputs, prompts, and agent behaviors.
- Integrate evaluation tooling and pipelines into CI/CD workflows and scale automated evaluation processes.
- Define and refine metrics for both structural and semantic correctness, ensuring alignment with realistic use cases and operational constraints.
- Lead dataset management processes and guide teams across Grafana in best practices for GenAI evaluation.
Requirements
- Experience designing and implementing evaluation frameworks for AI/ML systems.
- Familiarity with prompt engineering, structured output evaluation, and context-window management for LLM systems.
- Ability to translate team goals into clear, testable criteria and effective tooling with high autonomy.
Bonus Qualifications
- Experience working in environments with rapid iteration and experimental development.
- A pragmatic mindset emphasizing reproducibility, developer experience, and thoughtful trade-offs when scaling GenAI systems.
- Passion for minimizing human toil and building AI systems that actively support engineers.
Compensation & Benefits
- Base compensation range (Canada): CAD 164,490 - CAD 197,389. Actual compensation may vary by level, experience, and skillset.
- All roles include Restricted Stock Units (RSUs).
- 100% remote company culture, global collaboration, in-person onboarding, 30 days annual leave (with 3 reserved for Grafana Shutdown Days), transparent communication, and career growth pathways.
Additional Notes
- Grafana Labs may use AI tools in its recruitment process to assist in matching CVs to job postings. The recruitment team will manually review inbound CVs.
- This role is remote but currently limited to applicants in Canadian time zones.