Senior AI Engineer

at Grafana Labs

📍 Canada

CAD 164,500-197,400 per year

SENIOR

✅ Remote

Used Tools & Technologies

Not specified

Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level. The proficiency levels are:

1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;

3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;

7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;

10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.

Marketing @ 4, Grafana @ 4, Python @ 7, GCP @ 4, Airflow @ 4, GitHub @ 6, CI/CD @ 4, Hiring @ 4, Communication @ 7, Git @ 4, JavaScript @ 7, React @ 4, Node.js @ 7, Microservices @ 4, Slack @ 4, API @ 4, Workato @ 4, LLM @ 4, Audit @ 4, Compliance @ 4, Salesforce @ 3, Claude Code @ 6, Observability @ 4, AI @ 4, RAG @ 4, LangChain @ 4, Prompt Engineering @ 4

Details

Grafana Labs is hiring a Senior Engineer (AI & Automation) to own the AI agent infrastructure and automation platform that powers Marketing Operations. This is a remote opportunity for candidates located in Canada (residents of Quebec are not eligible). You will design and ship multi-agent architectures, LLM integrations, and backend services that connect AI models to internal and third-party data platforms and deliver production systems used daily by business teams.

Responsibilities

Own end-to-end development of multi-agent AI systems, from architecture and implementation through testing, deployment, and ongoing operation
Build modular, composable agentic systems using orchestration frameworks (LangChain, CrewAI, Anthropic MCP, or similar) that operate 24/7 across teams
Develop reusable agentic skills that agents invoke across interfaces (Slack, dashboards, internal apps, CLIs)
Implement observability and feedback loops including logging, performance metrics, prompt iteration, model evaluation, and cost management
Establish governance and compliance standards for AI workflows including access controls, audit trails, PII handling, and human-in-the-loop escalation paths
Build MCP servers, APIs, CLIs, and microservices connecting AI models to business systems (BigQuery, Slack, CRMs, email, calendars, analytics tools)
Architect data flows for retrieval-augmented generation (RAG), connecting LLMs to internal knowledge bases, customer data, and real-time business context
Build serverless or containerized services (GCP Cloud Functions, Cloud Run) that scale with usage and integrate with Grafana's cloud infrastructure
Partner with RevOps, Demand Generation, Regional Marketing, and SDR teams to scope high-impact automation problems and build measurable solutions
Design and deploy workflows using orchestration tools (n8n, Workato, or custom platforms) with CI/CD, testing, and production reliability standards
Produce documentation, playbooks, and enablement materials to allow partner teams to operate independently

Requirements

8+ years of software engineering experience with depth in backend development, systems integration, or data/analytics engineering
2+ years of hands-on experience applying LLMs/AI to production workflows (beyond prototypes)
Strong proficiency in Python and JavaScript/Node.js with Git-based workflows, code review practices, and testing discipline
Hands-on experience with LLM frameworks and patterns including prompt engineering, RAG, function calling/tool use, structured output parsing, and evaluation
Experience building and operating multi-agent systems at scale including agent decomposition, orchestration patterns, state management, and production monitoring
Deep familiarity with Google Cloud Platform, BigQuery, and serverless/containerized services (Cloud Functions, Cloud Run)
Understanding of LLM failure modes and production mitigations including confidence thresholds, fallback logic, human escalation, and cost/latency management
Proven ability to identify high-leverage problems and deliver end-to-end with minimal direction
Fluency with AI-assisted development tools (GitHub Copilot, Cursor, Claude Code) and a pragmatic approach to using them
Strong communication skills to explain complex systems to both engineers and business stakeholders

Bonus Points

Experience with vector databases or retrieval pipelines (Pinecone, Weaviate, ChromaDB, Qdrant, pgvector)
Familiarity with marketing or sales platforms (Salesforce, Customer.io, HubSpot, Marketo, Outreach)
Experience with frontend frameworks (React, Slack Block Kit) for building user-facing AI tool interfaces
Experience with observability tooling for AI systems (LangSmith, Weights & Biases, custom evaluation frameworks)
Experience with workflow orchestration platforms (n8n, Temporal, Prefect, Airflow)
Familiarity with Model Context Protocol (MCP) or similar standards
Prior work automating marketing, sales, or customer success workflows in a B2B SaaS environment
Active participation in open-source communities

Compensation

In Canada, the base compensation range for this role is CAD 164,490 - CAD 197,389. Actual compensation may vary based on level, experience, and skillset as assessed throughout the interview process.
All roles include Restricted Stock Units (RSUs).

Why You’ll Thrive at Grafana Labs

100% remote, global culture with a high-trust, low-ego environment
Scaling organization with meaningful work and transparent communication
Strong emphasis on developer productivity and AI-assisted development
In-person onboarding and generous annual leave policy (30 days per annum, including Grafana Shutdown Days)