Required Skills & Competences
Security @ 2, Grafana @ 3, Kubernetes @ 3, Prometheus @ 3, DevOps @ 3, Terraform @ 3, Python @ 3, GCP @ 3, CI/CD @ 3, Datadog @ 3, Distributed Systems @ 3, AWS @ 3, Azure @ 3, Helm @ 3, Networking @ 3, SRE @ 3, Debugging @ 3
Details
Collibra’s Engineering team builds infrastructure and platforms that help large organizations manage and deploy data products. As a Platform Engineer for Unstructured AI you will design, implement, and maintain the infrastructure and deployment patterns customers use to run the Unstructured AI application in their cloud environments. This is a hybrid role based in our New York office (work from the office at least two days each week).
Responsibilities
- Design, implement, and maintain infrastructure blueprints enabling customers to deploy the Unstructured AI application in customer cloud environments (AWS, GCP, Azure).
- Codify infrastructure as code (Terraform), creating modular, secure, and reproducible configurations to support multiple deployment scenarios (bring-your-own VPC, custom networking, varied customer setups).
- Manage and evolve the deployment stack, including Kubernetes and Helm-based deployments, and supporting components such as databases, secrets, authentication, and SSO.
- Set up and maintain monitoring, alerting, and observability systems (Datadog, Prometheus, Grafana) for operational visibility and proactive incident detection.
- Participate in incident response and troubleshooting for customer-deployed environments, using observability data to diagnose and resolve issues.
- Collaborate with application teams on Python code and configuration changes that keep the application aligned with infrastructure and deployment patterns.
- Build tools and services to automate product updates, simplifying upgrades and ongoing maintenance of the Unstructured AI application.
- Work closely with product and field engineering teams to ensure reliable, scalable, and easily deployable infrastructure for customers.
Requirements
- 2+ years of engineering or technical experience with exposure to infrastructure, cloud, or platform work.
- Strong experience with Terraform or similar infrastructure-as-code tools and modular deployment best practices.
- Good understanding of Kubernetes, Helm, and container orchestration.
- Experience deploying and managing workloads in at least one major cloud platform (AWS, GCP, or Azure).
- Experience working with Python application code, sufficient to keep it aligned with infrastructure and deployment patterns.
- Familiarity with networking, IAM, and security configurations, including VPC design and private networking.
- Experience with deployment automation and CI/CD pipelines, including container image registries (ECR, GCR, ACR).
- Hands-on experience with monitoring, observability, and incident response; debugging distributed systems using tools like Datadog, Prometheus, or Grafana.
- Background in DevOps, SRE, or platform engineering for complex, customer-deployed applications.
- A bachelor’s degree or equivalent work experience is required.
- This position is not eligible for visa sponsorship.
Measures of Success
- Month 1: Understand the Unstructured AI deployment architecture across clouds and begin contributing small improvements to Terraform and Helm configurations.
- Month 3: Lead enhancements that improve reliability, visibility, and maintainability of customer deployments across environments.
- Month 6: Drive major automation initiatives that simplify ongoing operations, upgrades, and scalability for the Unstructured AI platform.
Compensation
- Base salary range: $140,000 - $175,000 per year. Salary offers are based on experience, skills, and location.
- Additional compensation: equity ownership, bonus potential, a Flex Fund monthly stipend, pension/401(k) plans, and more.
Benefits
Collibra offers a flexible benefits program including competitive compensation, health coverage, time off, and programs for inclusion and belonging. Learn more about Collibra’s benefits and DEI resources via the company careers pages.