Required Skills & Competences
Go @ 7, Grafana @ 3, Kubernetes @ 6, Prometheus @ 3, DevOps @ 7, Terraform @ 4, GCP @ 4, CI/CD @ 4, Hiring @ 4, AWS @ 4, Azure @ 4, Networking @ 4, SRE @ 4, OpenTelemetry @ 3
Details
ClickHouse is expanding its cloud data platform across AWS, GCP, and Azure, adding new capabilities that connect and extend Postgres and ClickHouse for modern data workloads. We’re hiring a Senior SRE / Senior Infrastructure Engineer to own reliability, automation, and operations as these services scale globally. This is a hands-on, high-impact role in which you’ll write code, shape architecture, and enable the broader engineering team to deliver with confidence and velocity.
Responsibilities
- Lead reliability and operations for ClickHouse’s Postgres integration, including upgrades, patching, maintenance, and scaling.
- Design and implement automation for provisioning, deployments, and service lifecycle management across AWS, GCP, and Azure.
- Develop infrastructure-as-code using Terraform and modern CI/CD tooling to ensure consistent, repeatable deployments.
- Contribute Go-based tooling and services that improve automation, observability, and developer experience.
- Own observability and monitoring, ensuring robust alerting, metrics, and tracing across environments.
- Drive incident management and postmortem practices that strengthen reliability and learning loops.
- Collaborate cross-functionally with platform, networking, and product teams to improve service operability.
- Mentor and enable engineers, helping the team scale effectively as customer adoption grows.
Requirements
- 7+ years in SRE, DevOps, or infrastructure engineering, with a track record of running distributed, production-grade systems.
- Solid understanding of Postgres operations, scaling, and performance tuning.
- Deep hands-on experience with AWS, plus exposure to GCP and Azure; comfortable navigating multi-cloud topologies.
- Proficient with Terraform, Kubernetes, and container-based infrastructure.
- Strong Go development skills (or willingness to write and own production Go code).
- Familiarity with observability tools such as Prometheus, Grafana, Loki, OpenTelemetry, or equivalents.
- Deep understanding of SLOs, incident response, and continuous improvement in service reliability.
- Founder’s mentality: hands-on, resourceful, autonomous, and focused on shipping impactful systems.
Compensation
The typical starting salary for this role in the US is $140,000 - $208,000 USD. The typical starting salary for this role in US Premium Markets (e.g., Los Angeles, San Francisco Bay Area, Seattle, New York City Metro Area) is $155,000 - $230,000 USD. Actual compensation may vary based on experience, location, and other factors.
Perks & Benefits
- Flexible work environment; ClickHouse is globally distributed and remote-friendly.
- Employer contributions towards healthcare.
- Equity via stock options for new hires.
- Flexible time off in the US and generous entitlements in other countries.
- $500 home office setup for remote employees.
- Opportunities for global gatherings and company-wide offsites.
About ClickHouse
Recognized on the 2025 Forbes Cloud 100 list, ClickHouse is a fast-growing private cloud company focused on real-time analytics, data warehousing, observability, and AI workloads. The company serves over 2,000 customers and recently closed a $350M Series C financing. As part of the early team, you will help shape culture, architecture, and operational practices.
Equal Opportunity & Privacy
ClickHouse is an equal opportunity employer and provides information about applicant privacy in its privacy statement. If you have questions about compensation, contact [email protected].