Used Tools & Technologies
Not specified
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose;
- 3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology;
- 7-9: expert. You can teach others, and you know the pitfalls and tricks;
- 10: exceptional knowledge. Comprehensive understanding of and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Marketing @ 4
Grafana @ 4
Python @ 7
GCP @ 4
Airflow @ 4
GitHub @ 4
CI/CD @ 4
Communication @ 4
Git @ 4
JavaScript @ 7
React @ 4
Node.js @ 7
Microservices @ 4
Slack @ 4
API @ 4
Workato @ 4
LLM @ 4
Audit @ 4
Compliance @ 4
Salesforce @ 3
Codex @ 4
Claude Code @ 4
Observability @ 4
AI @ 4
RAG @ 4
LangChain @ 4
Prompt Engineering @ 4
Details
Grafana Labs is the company behind the open observability cloud. Grafana Cloud is a fully managed observability platform built for scale, combining open source, open standards, open ecosystems, and an open culture. Grafana Labs is a 100% remote company with team members across 40+ countries.
This is a remote opportunity for candidates based in the United States.
Responsibilities
Agentic Systems & AI Infrastructure
- Own end-to-end development of multi-agent AI systems, from architecture and implementation through testing, deployment, and ongoing operation
- Build modular, composable agentic systems using orchestration frameworks (LangChain, CrewAI, Anthropic MCP, or similar) that operate 24/7 across teams
- Develop reusable agentic skills that agents invoke across interfaces (Slack, dashboards, internal apps, CLIs)
- Implement observability and feedback loops including logging, performance metrics, prompt iteration, model evaluation, and cost management
- Establish governance and compliance standards for AI workflows including access controls, audit trails, PII handling, and human-in-the-loop escalation paths
Systems Integration & Backend Services
- Build MCP servers, APIs, CLIs, and microservices connecting AI models to business systems (BigQuery, Slack, CRMs, email, calendars, analytics tools)
- Architect data flows for retrieval-augmented generation (RAG), connecting LLMs to internal knowledge bases, customer data, and real-time business context
- Build serverless or containerized services (GCP Cloud Functions, Cloud Run) that scale with usage and integrate with Grafana's cloud infrastructure
Automation & Workflow Enablement
- Partner with RevOps, Demand Generation, Regional Marketing, and SDR teams to scope high-impact automation problems, identify bottlenecks, and build solutions with measurable business outcomes
- Design and deploy workflows using orchestration tools (n8n, Workato, or custom platforms) with CI/CD, testing, and production reliability standards
- Build systems designed for self-service with documentation, playbooks, and enablement materials that let partner teams operate independently
Grafana invests heavily in developer productivity and provides access to AI coding assistants (Claude Code, Gemini CLI, OpenAI Codex, etc.) while maintaining code review and quality standards.
Requirements
- 8+ years of software engineering experience with depth in backend development, systems integration, or data/analytics engineering
- 2+ years of hands-on experience applying LLMs/AI to production workflows (beyond prototypes)
- Strong proficiency in Python and JavaScript/Node.js with Git-based workflows, code review practices, and testing discipline
- Hands-on experience with LLM frameworks and patterns including prompt engineering, RAG, function calling/tool use, structured output parsing, and evaluation
- Experience building and operating multi-agent systems at scale including agent decomposition, orchestration patterns (sequential chains, router/dispatcher, parallel fan-out), state management, and production monitoring
- Familiarity with Google Cloud Platform, BigQuery, and serverless/containerized services (Cloud Functions, Cloud Run)
- Understanding of LLM failure modes and production mitigations including confidence thresholds, fallback logic, human escalation, and cost/latency management
- Strong problem diagnosis skills and the ability to deliver end-to-end with minimal direction
- Comfortable using AI-assisted development tools (GitHub Copilot, Cursor, Claude Code) to build AI systems
- Clear technical communication skills for both engineers and business stakeholders
Nice to Have / Bonus
- Experience with vector databases or retrieval pipelines (Pinecone, Weaviate, ChromaDB, Qdrant, pgvector)
- Familiarity with marketing or sales platforms (Salesforce, Customer.io, HubSpot, Marketo, Outreach)
- Experience with frontend frameworks (React, Slack Block Kit) for building user-facing AI tool interfaces
- Experience with observability tooling for AI systems (LangSmith, Weights & Biases, custom evaluation frameworks)
- Experience with workflow orchestration platforms (n8n, Temporal, Prefect, Airflow)
- Familiarity with Model Context Protocol (MCP) or similar standards for connecting AI systems to data sources
- Prior work automating marketing, sales, or customer success workflows in a B2B SaaS environment
- Active participation in open-source communities
Compensation & Benefits
- Base compensation range in the United States: USD $154,445 - USD $185,334 per year
- Roles include Restricted Stock Units (RSUs)
- 100% remote company with a global annual leave policy of 30 days per year (3 days reserved for Grafana Shutdown Days; subject to local legislation where applicable)
- In-person onboarding
Equal Opportunity
Grafana Labs is an equal opportunities employer and may utilize AI tools in recruitment alongside manual review by the recruitment team.