Used Tools & Technologies
Not specified
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level value.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Python @ 4
Spark @ 4
Machine Learning @ 4
Communication @ 7
FastAPI @ 4
API @ 4
LLM @ 4
Salesforce @ 4
AI @ 4
Data Pipelines @ 4
Details
Joining Collibra’s Unstructured AI Team
Work at the forefront of context engineering — shaping how AI systems retrieve, structure, and leverage context to deliver accurate, high-quality results at scale. Own end-to-end technical delivery of Unstructured AI systems — from feature prototype to stable production across enterprise environments. Build and scale full-stack systems that ingest, process, and enrich large volumes of unstructured content from distributed enterprise silos (PDFs, contracts, reports, and other document types). Collaborate with senior leaders and founders to understand complex business challenges and deliver solutions. Stay ahead of the curve by engaging with the latest developments in machine learning and AI, sharing knowledge and leading by example.
This is a hybrid role based in Collibra's New York office. The hybrid model requires working from the office at least two days each week.
Responsibilities
- Ship complex systems in ambiguous environments, balancing speed and precision in real production settings.
- Write and review production-grade backend code (Python, FastAPI).
- Build and deploy document-processing systems that handle large-scale, unstructured data environments.
- Integrate data from diverse enterprise data sources (e.g., SharePoint, Salesforce, internal APIs) to provide context for AI features.
- Partner across engineering, product, and sales teams to ensure alignment from prototype to rollout.
- Occasionally work with modern frontend development.
Requirements
- Strong proficiency in Python (data processing, API development, and integrations).
- Hands-on experience with LLM-based and AI-driven enrichment models (classification, entity extraction, deduplication, PII detection).
- Proven ability to deliver production-grade systems using Big Data frameworks (e.g., Spark) to handle data at scale.
- Solid understanding of data pipelines, microservice architecture, and API design.
- Experience ingesting and processing data from third-party enterprise sources (e.g., SharePoint/OneDrive, Salesforce, SaaS knowledge bases).
- Familiarity with metadata systems, data cataloging, or document AI workflows.
- Knowledge of model evaluation best practices.
- Experience with search relevance.
- Strong communication skills and the ability to work across technical and business teams.
- A bachelor’s degree or equivalent related work experience is required.
- This position is not eligible for visa sponsorship.
You Are
- Calm and structured in decision-making under tight timelines or ambiguity.
- Able to communicate clearly across engineering, product, and field teams.
- Experienced in spotting risks early and course-correcting without friction.
- Someone who cares deeply about data quality, precision, and governance.
Measures of Success
- Within the first month: develop a deep understanding of the product vision and unstructured data stack; ship your first set of end-to-end features.
- Within three months: take full ownership of technical delivery for key product areas and build robust capabilities for complex document processing.
- Within six months: drive development of enterprise-grade AI product features that handle data at scale, and architect high-performance pipelines and context-engineering systems.
Compensation
The standard base salary range for this position is $204,000 - $255,000 per year. This role is not eligible for additional commission-based compensation. Salary offers are based on a combination of factors, including experience, skills, and location. In addition to base salary, Collibra offers bonus potential, equity for eligible roles, a Flex Fund monthly stipend, pension/401k plans, and more.
Benefits
Collibra offers a flexible benefits program including competitive compensation, health coverage, time off, and additional flexible offerings to support employees and loved ones. Learn more about Collibra’s benefits and DEI information at Collibra's careers pages.