Senior Data Management Professional: Automation Engineer – Entities
Used Tools & Technologies
Machine Learning
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such level.
Kafka @ 7
Python @ 7
Scala @ 7
SQL @ 7
Spark @ 7
Java @ 7
Airflow @ 7
CI/CD @ 6
AWS @ 4
Azure @ 4
Communication @ 4
Data Engineering @ 4
Git @ 6
Data Analysis @ 4
Databricks @ 3
LLM @ 4
Observability @ 4
AI @ 4
Profiling @ 4
Details
Bloomberg Data powers products with rich, timely, and highly contextualized information. The Entities Data Management Team owns core entity data that underpins Bloomberg’s financial products, including corporate hierarchies, risk attribution, and issuer relationships across public and private markets. The team is modernizing how this data is sourced, processed, and governed—ingesting from structured third-party feeds, unstructured documents, and internal systems. The role focuses on building scalable, automated pipelines, robust governance frameworks, and a decision engine to arbitrate competing inputs and determine the most accurate values for publishing.
Responsibilities
- Design and build the data arbitration/decision engine to resolve conflicts across multiple data sources and determine which values to publish.
- Drive standardization and automation of ingestion pipelines across structured, unstructured, and internal sources.
- Profile datasets, conduct data analysis to identify quality gaps and inconsistencies, and recommend process improvements.
- Implement data lineage, observability, and monitoring frameworks to ensure transparency, traceability, and reliability.
- Collaborate with Product Managers, Engineering, and cross-functional data teams to define and evolve platform requirements and technical architecture.
- Apply a data-product mindset balancing engineering efficiency with data quality, client needs, and maintainability.
- Support integration of AI/LLM-based tools as part of the data processing and enrichment strategy.
Requirements
- 4+ years of experience in data engineering, data architecture, or data automation roles (the number of years is a guideline, not a strict requirement).
- Experience working with financial data, specifically reference or entity/company data domains.
- Strong proficiency in at least one programming language (e.g., Python, Java, Scala) and in modern data tooling (e.g., Spark, Airflow, Kafka).
- Strong SQL skills for data transformation, validation, and reconciliation.
- Demonstrated experience working with large-scale datasets and multi-source data arbitration, normalization, and conflict resolution across heterogeneous datasets.
- Deep understanding of data governance, quality frameworks, and metadata management.
- Experience with data profiling, validation techniques, and implementing data lineage and observability.
- Proven ability to work independently and cross-functionally in a fast-evolving environment, with excellent communication skills to explain technical decisions to varied stakeholders.
- Experience building decision engines using rules-based logic and/or AI/ML or LLM-based models.
Nice to have
- Familiarity with frameworks such as DCAM or DAMA-DMBOK.
- Experience with AWS and/or Azure for cloud-native data processing and storage.
- Proficiency with Git and CI/CD pipelines for production-grade deployments.
- Familiarity with cloud data services (examples: S3, EMR, Glue, ADLS, Data Factory, Databricks).
- Experience implementing data observability tools (examples: Monte Carlo, OpenLineage, or custom solutions).
Benefits
- Salary range: 110,000 - 190,000 USD annually plus benefits and bonus (actual compensation may vary by location, experience, and skills).
- Comprehensive benefits package (medical, dental, vision, 401(k)+match, paid time off, holidays, disability, life insurance, wellness programs, etc.).
- Opportunity to shape architecture and decision logic for foundational datasets and influence ingestion, arbitration, and publishing at scale.
For more information or to apply, see the company posting links provided in the original job listing.