Data Engineer
EUR 40,800-85,200 per year
Required Skills & Competences
Kafka @ 4, IaC @ 4, Terraform @ 4, Python @ 4, SQL @ 4, ETL @ 4, Airflow @ 4, CI/CD @ 4, Algorithms @ 4, Distributed Systems @ 4, Machine Learning @ 4, Data Science @ 4, AWS @ 4, Communication @ 7, Data Engineering @ 4, ELT @ 4
Details
Our Threat Intelligence team is dedicated to providing accurate and timely information on potential threats to our products. The team brings together skilled professionals from various fields, including Data Science, Malware Research, Development, and Privacy, who work toward a common goal using feeds, heuristics, algorithms, and machine learning.
Responsibilities
- Design, build, and manage scalable data pipelines using Python, SQL, and PySpark.
- Develop and maintain lakehouse architectures, with hands-on use of Apache Hudi for data versioning, upserts, and compaction.
- Implement efficient ETL/ELT processes for both batch and real-time data ingestion.
- Optimize data storage and query performance across large datasets (partitioning, indexing, compaction).
- Ensure data quality, governance, and lineage, integrating validation and monitoring into pipelines.
- Work with cloud-native services (preferably AWS – S3, Athena, EMR) to support modern data workflows.
- Collaborate closely with data scientists, analysts, and platform engineers to deliver reliable data infrastructure.
Requirements
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- 2+ years of experience as a Data Engineer, working with large-scale distributed systems.
- Proven expertise in lakehouse architecture and Apache Hudi in production environments.
- Experience with Airflow, Kafka, or streaming data pipelines.
- Strong programming skills in Python and PySpark.
- Comfortable working in a cloud-based environment (preferably AWS).
- Strong communication and collaboration skills.
Nice to Have
- Knowledge of CI/CD and Infrastructure as Code (IaC) tools such as Terraform.
- Exposure to data cataloging tools (e.g., Glue Data Catalog, Amundsen).
- Interest or experience in cybersecurity or secure data design.