Used Tools & Technologies
Machine Learning
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level value.
About proficiency levels:
- 1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose;
- 3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology;
- 7-9: expert. You can teach others, and you know all the pitfalls and tricks;
- 10: exceptional knowledge. Comprehensive understanding of and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Go @ 7
Kafka @ 3
Kubernetes @ 4
Redis @ 3
Python @ 7
GCP @ 4
Java @ 7
Distributed Systems @ 4
AWS @ 4
Azure @ 4
gRPC @ 4
Debugging @ 7
API @ 4
Engineering Management @ 4
LLM @ 4
OpenTelemetry @ 7
Observability @ 4
AI @ 4
Details
About Glean
Glean is the Work AI platform that helps everyone work smarter with AI. What began as advanced enterprise search has evolved into a Work AI ecosystem powering intelligent Search, an AI Assistant, and scalable AI agents on one secure, open platform. Glean provides enterprise connectors, flexible LLM choice, and APIs to govern, scale, and customize AI across organizations. The company emphasizes an Enterprise Graph and Personal Knowledge Graph to deliver personalized, context-aware responses and powers agentic capabilities that automate work across teams.
About the role
The Tech Lead Manager of the Agentic Runtime team builds the low-latency, reliable, and secure foundation that powers Glean’s AI agents and assistant experiences at scale. You will design and operate core runtime services for multi-turn orchestration, tool calling, model routing, memory, streaming, and safety. The role spans distributed systems, production observability, and ML infra integrations to deliver an experience that is fast, accurate, trustworthy, and cost-effective.
Responsibilities
- Own runtime problems end-to-end: architecture, design, production launch, and ongoing reliability.
- Build and evolve core services for session lifecycle, streaming responses (gRPC/WebSockets), structured tool execution, memory/state, and policy/guardrails.
- Design for performance, correctness, and cost: reduce p50/p95 latency, improve tail behavior, and optimize token/tool budgets.
- Integrate with leading LLM providers (e.g., OpenAI, Anthropic, Google Gemini) and internal evaluation frameworks to improve quality and predictability.
- Harden the platform with fault isolation, retries, timeouts, circuit-breaking, backpressure, and graceful degradation.
- Instrument deep observability (tracing, metrics, logs) and create playbooks/SLOs for high availability and on-call excellence.
- Collaborate closely with product, quality, and application teams to prioritize roadmap investments.
Requirements
- 8+ years of software engineering experience building production distributed systems or cloud-native applications.
- 1+ years of engineering management experience.
- BS/BA in Computer Science or related field, or equivalent practical experience.
- Strong coding skills in at least one of: Python, Go, Java, or C++, with focus on reliability, performance, and tests.
- Product-minded: prioritize customer impact, clear SLAs/SLOs, and pragmatic iteration.
- Ownership-driven with a positive, proactive attitude; comfortable leading projects and learning from experienced engineers.
- Experience operating services on Kubernetes and at least one major cloud (e.g., GCP, AWS, or Azure).
- Familiarity with event/streaming systems (e.g., Pub/Sub, Kafka), caching (e.g., Redis), and data stores for low-latency paths.
- Practical understanding of LLM/agent building blocks: tool/function calling, structured outputs, streaming, and model selection/routing.
- Strong observability and debugging skills: tracing (e.g., OpenTelemetry), metrics, dashboards, and production forensics.
- Background in one or more of these is a plus: policy/guardrails, multi-tenant isolation, rate-limiting, concurrency control, cost optimization.
Location & Office Policy
This role is hybrid: 4 days a week in one of Glean’s San Francisco Bay Area offices.
Compensation & Benefits
- Standard base salary range: $250,000 - $300,000 annually.
- Compensation may include variable compensation, equity, and benefits depending on role and experience.
- Benefits include medical, vision, and dental coverage, generous time off, the opportunity to contribute to a 401(k) plan, a home office improvement stipend, annual education and wellness stipends, regular company events, and daily healthy lunches.
Other
Glean is committed to diversity and inclusion and does not discriminate based on gender, ethnicity, sexual orientation, religion, civil or family status, age, disability, or race.