Senior Data Engineer (AI and ML frameworks)
Warsaw, Masovian Voivodeship, Poland
Sigma Software
Sigma Software is a multinational IT company that provides custom software development solutions. Become one of us!
Company Description
We are looking for a talented Senior Data Engineer with a strong background in developing or contributing to microservice-based applications built on a Kappa architecture.
CUSTOMER
Our client is a leading analytics company operating at the intersection of technology, artificial intelligence, and big data. They support manufacturers and retailers in the fast-moving consumer goods sector, helping them better understand market dynamics, uncover consumer behavior insights, and make data-driven business decisions.
PROJECT
The project aims to unify data sourced from various EHR systems in the healthcare domain using the FHIR data format. The company’s proprietary technology platform combines high-quality data, deep industry expertise, and advanced predictive algorithms built over decades of experience in the field.
Job Description
Data Standardization and Transformation:
Convert diverse data structures from various EHR systems into a unified format based on FHIR standards
Map and normalize incoming data to the FHIR data model, ensuring consistency and completeness
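To give a feel for this mapping work, here is a minimal sketch of normalizing one EHR patient record into a FHIR R4 Patient resource. The source-side field names (`patient_id`, `dob`, etc.) and the identifier system URI are illustrative assumptions; a real pipeline would use a validating FHIR library and project-specific profiles.

```python
def to_fhir_patient(ehr: dict) -> dict:
    """Normalize one EHR patient record into a FHIR Patient resource dict."""
    return {
        "resourceType": "Patient",
        "identifier": [{
            "system": "urn:example:ehr-source",   # hypothetical source-system URI
            "value": ehr["patient_id"],
        }],
        "name": [{
            "family": ehr["last_name"],
            "given": [ehr["first_name"]],
        }],
        "gender": ehr.get("sex", "unknown").lower(),
        "birthDate": ehr["dob"],                  # assumed ISO-8601 YYYY-MM-DD
    }

record = {"patient_id": "A-1001", "first_name": "Jan", "last_name": "Kowalski",
          "sex": "Male", "dob": "1980-05-17"}
patient = to_fhir_patient(record)
print(patient["resourceType"], patient["gender"])  # Patient male
```

The point of funneling every source system through one function like this is that downstream consumers only ever see the FHIR shape, regardless of which EHR produced the record.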
Kafka Integration:
Consume and process events from the Kafka stream produced by the Data Writer Module
Deserialize and validate incoming data to ensure adherence to required standards
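As a sketch of the deserialize-and-validate step applied to each message from the Data Writer Module (no broker involved here): the required-field set below is an assumption, and production code would validate against a schema registry (e.g. Avro with Confluent Schema Registry) rather than this hand-rolled check.

```python
import json

# Assumed envelope fields for an incoming event; illustrative only.
REQUIRED_FIELDS = {"event_id", "resource_type", "payload"}

def deserialize_event(raw: bytes) -> dict:
    """Decode one raw Kafka message value and reject malformed events."""
    event = json.loads(raw.decode("utf-8"))
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        raise ValueError(f"event missing required fields: {sorted(missing)}")
    return event

msg = b'{"event_id": "e-1", "resource_type": "Observation", "payload": {}}'
event = deserialize_event(msg)
print(event["resource_type"])  # Observation
```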
Data Segmentation:
Separate data streams for warehousing and AI model training, applying specific preprocessing steps for each purpose
Prepare and validate data for storage and machine learning model training
Error Handling and Logging:
Implement robust error handling mechanisms to track and resolve data mapping issues
Maintain detailed logs for auditing and troubleshooting purposes
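A common shape for this requirement is the dead-letter pattern: records that fail mapping are logged and parked for later inspection instead of crashing the stream. The mapping rule below (requiring a `patient_id`) is an illustrative assumption.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("fhir-mapper")

def map_record(record: dict) -> dict:
    # Hypothetical mapping rule for illustration.
    if "patient_id" not in record:
        raise KeyError("patient_id")
    return {"resourceType": "Patient", "id": record["patient_id"]}

def process(records):
    mapped, dead_letter = [], []
    for rec in records:
        try:
            mapped.append(map_record(rec))
        except Exception as exc:
            # Log for auditing; park the failed record with its error.
            log.warning("mapping failed for %r: %s", rec, exc)
            dead_letter.append({"record": rec, "error": str(exc)})
    return mapped, dead_letter

ok, dlq = process([{"patient_id": "A-1"}, {"oops": True}])
print(len(ok), len(dlq))  # 1 1
```

In a Kafka deployment the dead-letter list would typically be a dedicated topic so failed events can be replayed after the mapping bug is fixed.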
Data Ingestion and Processing:
Use LLMs to extract structured data from EHRs, research articles, and clinical notes
Ensure semantic consistency and interoperability during data ingestion
Knowledge Graph Construction:
Integrate extracted data into a knowledge graph, representing entities and relationships for semantic data integration
Implement contextual understanding and querying of complex relationships within the knowledge graph (KG)
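To make the KG work concrete, here is a toy in-memory graph of subject-predicate-object triples with a simple relationship query. A production KG would live in Neo4j or an RDF store as the qualifications note; the entities below are illustrative.

```python
from collections import defaultdict

class TinyGraph:
    """Toy triple store: (subject, predicate, object) with subject-indexed lookup."""

    def __init__(self):
        self.triples = []
        self.by_subject = defaultdict(list)

    def add(self, subj, pred, obj):
        self.triples.append((subj, pred, obj))
        self.by_subject[subj].append((pred, obj))

    def query(self, subj, pred):
        """All objects related to subj by pred."""
        return [o for p, o in self.by_subject[subj] if p == pred]

g = TinyGraph()
g.add("Patient/A-1", "hasCondition", "Condition/diabetes")
g.add("Patient/A-1", "takesMedication", "Medication/metformin")
g.add("Condition/diabetes", "treatedBy", "Medication/metformin")

print(g.query("Patient/A-1", "hasCondition"))  # ['Condition/diabetes']
```

Contextual querying then becomes graph traversal: starting from a patient node, following `hasCondition` and then `treatedBy` edges answers "which of this patient's medications treat their conditions".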
Advanced Predictive Modeling:
Leverage KGs and LLMs to enhance data interoperability and predictive analytics
Develop frameworks for contextualized insights and personalized medicine recommendations
Feedback Loop:
Continuously update the knowledge graph with new data using LLMs, ensuring up-to-date and relevant insights
Work Closely with Cross-Functional Teams:
Collaborate with data scientists, AI specialists, and software engineers to design and implement data processing solutions
Communicate effectively with stakeholders to align on goals and deliverables
Contribute to Engineering Culture:
Foster a culture of innovation, collaboration, and continuous improvement within the engineering team
Qualifications
Deep understanding of patterns and software development practices for event-driven architectures
Hands-on experience with stateful stream data processing solutions (Kafka or similar streaming platforms)
Strong knowledge of data serialization/deserialization using various data formats (at minimum JSON and Avro), and integration with schema registries
Proven Python software development expertise, with experience in data processing and integration (most of the software is written in Python)
Practical experience building end-to-end solutions with Apache Flink or a similar platform
Experience with containerization and orchestration using Kubernetes (K8s) and Helm, especially on Google Kubernetes Engine (GKE)
Familiarity with Google Cloud Platform (GCP) or a similar cloud platform
Hands-on experience implementing data quality solutions for schema-on-read or schema-less data
Hands-on experience integrating with Apache Kafka, particularly the Confluent Platform
Familiarity with AI and ML frameworks
Proficiency in SQL and experience with both relational and NoSQL databases
Experience with graph databases like Neo4j or RDF-based systems
Experience in the healthcare domain and familiarity with healthcare standards such as FHIR and HL7 for data interoperability
WOULD BE A PLUS:
Experience with web data scraping
Additional Information
PERSONAL PROFILE
Strong problem-solving skills, with the ability to design innovative solutions for complex data integration and processing challenges
Excellent communication skills, with the ability to articulate complex technical concepts and work effectively with various stakeholders
Commitment to improving healthcare through data-driven solutions and technology
Drive to stay abreast of the latest technologies and industry trends while continually improving your skills and knowledge
Ability to work in a collaborative environment, valuing diverse perspectives and contributing to a positive team culture