Data Engineer
Munich
Hawk
Award-winning AML & CFT technology powered by explainable AI increases your risk coverage, helps you identify more crime, and reduces your false positives.
About Us
Hawk is the leading provider of AI-supported anti-money laundering and fraud detection technology. Banks and payment providers globally use Hawk’s powerful combination of traditional rules and explainable AI to improve the effectiveness of their AML compliance and fraud prevention, identifying more crime while maximizing efficiency by reducing false positives. With our solution, we play a vital role in the global fight against money laundering, fraud, and the financing of terrorism. We offer a culture of mutual trust, support, and passion, while providing individuals with opportunities to grow professionally and make a difference in the world.
Your Mission:
As a Data Engineer at Hawk, your mission is to design, build, and maintain the data infrastructure that powers insights, decision-making, and innovation across the organization. You will be a key contributor in building out our data lake and shaping the architectural foundation for high-quality, scalable data solutions. Your expertise will enable seamless access to high-quality datasets for reporting, analytics, and machine learning. You'll work in an inclusive, collaborative environment where your contributions directly impact the success of our data-driven initiatives. From creating data pipelines to engaging in architectural discussions, you'll have the opportunity to lead, innovate, and grow while delivering critical value to our customers and stakeholders.
Key Responsibilities:
Build and maintain scalable data infrastructure: Design, implement, and optimize our data lake and associated pipelines to support reporting, analytics, and machine learning workloads.
Shape architecture and tooling: Lead discussions on data architecture, recommend tools and frameworks, and ensure the adoption of best practices for distributed data processing and orchestration.
Data preparation and quality: Prepare datasets for internal and external business reporting, ensuring data quality, consistency, and accessibility.
Machine learning readiness: Collaborate with data scientists to prepare and optimize datasets for machine learning and advanced analytics.
Collaborate across teams: Work closely with operations, data science, and business stakeholders to understand requirements and deliver solutions that align with their needs.
Drive innovation: Explore and implement new technologies and methods to optimize data storage, processing, and access.
Your Profile:
Educational background:
Bachelor’s or master’s degree in computer science or a related technical field.
Technical expertise:
Strong expertise in building data pipelines that support business-critical reporting infrastructure.
Strong experience with cloud platforms (AWS, GCP, Azure).
Proficient in distributed data processing tools (e.g., Spark) and stream processing frameworks (e.g., Kafka, Flink).
Strong understanding of lakehouse architectures and supporting technologies (e.g., Delta Lake, Iceberg, Hudi), distributed query engines (e.g., Trino, Presto), and orchestration tools (e.g., Airflow).
Experience with database technologies (e.g., Elasticsearch, PostgreSQL). Knowledge of graph databases (e.g., Neo4j) is a plus.
Familiarity with reporting tools (e.g., Tableau, Power BI, Databricks Dashboards).
Advanced Python programming skills.
Hands-on experience with distributed machine learning frameworks and pipelines is a plus.
Proven experience:
4+ years in data engineering or a related role, with a track record of delivering business value through scalable data solutions.
Collaborative mindset:
Strong interpersonal and communication skills, enabling you to collaborate effectively across diverse teams.
Commitment to quality:
Passion for building robust, high-quality solutions with a focus on innovation and continuous improvement.