Healthcare Data Engineer

CA, San Francisco, United States of America

Applications have closed

MyOme

MyOme is a clinical whole genome analysis platform company helping families understand their risk for inherited diseases. As a leader in polygenic modeling, MyOme leverages the power of the whole genome for a lifetime of actionable insights.

Posted 2 months ago

We are seeking a Healthcare Data Engineer to architect, develop, and scale pipelines that harmonize and integrate EHR data across different datasets. In this hands-on role, you will design and maintain high-throughput ETL workflows, apply standards such as HL7 FHIR, OMOP, and SNOMED to guarantee interoperability, and collaborate with bioinformatics, clinical, product, and engineering teams to deliver secure, research-ready data for our expanding disease-prediction pipeline.

Key Responsibilities

Data Standardization & Interoperability

Map heterogeneous data to HL7 FHIR, OMOP, SNOMED CT, ICD-10/11, LOINC, RxNorm, and related vocabularies.
Maintain high fidelity and minimal data loss through ontology-driven mapping and validation.

Design & Implement ETL Pipelines

Work with the engineering team to improve the workflows that ingest, de-identify, and harmonize clinical data from various EHR systems.
Integrate structured and unstructured data (clinical notes, imaging, lab results) into a unified schema.

Cloud Architecture & Scalability

Work with the engineering team to maintain a secure, cloud-based infrastructure capable of supporting petabyte-scale datasets.
Leverage distributed computing frameworks (e.g., Apache Spark, Databricks) for high-throughput data processing.

Privacy & Security

Ensure compliance with HIPAA, GDPR, and other applicable regulations.
Implement federated data-sharing patterns and robust encryption for data in transit and at rest.

Data Quality & Validation

Work with the engineering team to build automated anomaly-detection pipelines for real-time data quality checks.

Collaboration & Communication

Work with cross-functional teams (engineering, product, clinical, lab) to set timelines and roadmaps.
Share daily progress and surface blockers early while following established best practices in healthcare data engineering.

Skills and Experience

PhD in CS, Bioinformatics, or a related field; OR 5+ years of experience in data engineering, including at least 2 years specific to healthcare or clinical informatics.
Hands-on knowledge of HL7 FHIR, OMOP, SNOMED CT, and other healthcare data standards.
Proficiency in SQL and one or more programming languages (Python, C++).
Experience with cloud platforms (AWS, Azure, or GCP) and distributed frameworks (Spark, Databricks).
Familiarity with privacy-preserving architectures, data encryption, and federated data models.
Demonstrated success in building ETL pipelines.
Strong communication skills to translate complex data requirements into actionable plans for cross-functional teams.
Nice to have: Familiarity with genomic data, and/or NLP for clinical text.
