Senior Data Engineer
Rochester, MN, United States
Mayo Clinic
Join our innovative team at Mayo Clinic where we are shaping the future of healthcare through cutting-edge generative AI solutions. We are seeking a Senior Data Engineer with deep expertise in building scalable, secure, and high-performance data infrastructure to support the development and deployment of large language models (LLMs) and AI-powered applications. The ideal candidate will bring strong data engineering fundamentals, proficiency in Python and Bash, and advanced knowledge of Google Cloud Platform (GCP) services including BigQuery, Dataflow, Pub/Sub, and Cloud Storage. A deep understanding of data quality frameworks is essential—this includes designing and implementing strategies to ensure data accuracy, completeness, consistency, uniqueness, timeliness, and validity throughout the data lifecycle. Experience with real-time and batch ETL pipelines, data governance aligned with HIPAA standards, and infrastructure automation using Terraform is critical. Familiarity with tools like Vertex AI and the ability to integrate machine learning workflows into robust data pipelines will be key to enabling our AI-driven research and clinical solutions. Experience with the Cloud Healthcare API and healthcare data standards such as FHIR, HL7v2, and DICOM is a plus.
Develops and deploys data pipelines, integrations, and transformations to support analytics and machine learning applications and solutions as part of an assigned product team, using various open-source programming languages and vended software to meet the desired design functionality for products and programs. The position requires maintaining an understanding of the organization's current solutions, coding languages, and tools, and regularly requires the application of independent judgment. May provide consultative services to departments/divisions and leadership committees. Demonstrated experience in designing, building, and installing data systems, and in applying them to the Department of Data & Analytics technology framework, is required. Candidate will partner with product owners and Analytics and Machine Learning delivery teams to identify and retrieve data, conduct exploratory analysis, pipeline and transform data to help identify and visualize trends, build and validate analytical models, and translate qualitative and quantitative assessments into actionable insights.
During the selection process you may participate in a Codility test as well as an OnDemand (pre-recorded) interview that you can complete at your convenience. During the OnDemand interview, a question will appear on your screen, and you will have time to consider each question before responding. You will have the opportunity to re-record your answer to each question; Mayo Clinic will only see the final recording. The complete interview will be reviewed by a Mayo Clinic staff member, and you will be notified of next steps.
This is a full-time remote position within the United States. However, the incumbent may be asked to work on campus 1-2 days per month, so preference is given to candidates who live within a reasonable driving distance of a Mayo Clinic campus.
Mayo Clinic will not sponsor or transfer visas for this position including F1 OPT STEM.
A Bachelor's degree in a relevant field such as engineering, mathematics, computer science, information technology, health science, or other analytical/quantitative field and a minimum of five years of professional or research experience in data visualization, data engineering, or analytical modeling techniques; OR an Associate’s degree in a relevant field such as engineering, mathematics, computer science, information technology, health science, or other analytical/quantitative field and a minimum of seven years of professional or research experience in data visualization, data engineering, or analytical modeling techniques. In-depth business or practice knowledge will also be considered.
Incumbent must have the ability to manage a varied workload of projects with multiple priorities and stay current on healthcare trends and enterprise changes. Interpersonal skills, time management skills, and demonstrated experience working on cross-functional teams are required. Requires strong analytical skills, the ability to identify and recommend solutions, and a commitment to customer service. The position requires excellent verbal and written communication skills, attention to detail, and a high capacity for learning and problem resolution.
Advanced experience in SQL is required. Strong experience in scripting languages such as Python, JavaScript, PHP, C++, or Java, and in API integration, is required. Experience with hybrid data processing methods (batch and streaming) using tools such as Apache Spark, Hive, Pig, and Kafka is required. Experience with big data, statistics, and machine learning is required. The ability to navigate Linux and Windows operating systems is required. Knowledge of workflow scheduling (Apache Airflow, Google Cloud Composer), infrastructure as code (Kubernetes, Docker), and CI/CD (Jenkins, GitHub Actions) is preferred. Experience in DataOps/DevOps and agile methodologies is preferred. Experience with hybrid data virtualization such as Denodo is preferred. Working knowledge of Tableau, Power BI, SAS, ThoughtSpot, Dash, D3, React, Snowflake, SSIS, and Google BigQuery is preferred.
Google Cloud Platform (GCP) certification is preferred.
Mayo Clinic is top-ranked in more specialties than any other care provider according to U.S. News & World Report. As we work together to put the needs of the patient first, we are also dedicated to our employees, investing in competitive compensation and comprehensive benefit plans – to take care of you and your family, now and in the future. And with continuing education and advancement opportunities at every turn, you can build a long, successful career with Mayo Clinic. You’ll thrive in an environment that supports innovation, is committed to ending racism and supporting diversity, equity and inclusion, and provides the resources you need to succeed.
Benefits Highlights
- Medical: Multiple plan options.
- Dental: Delta Dental or reimbursement account for flexible coverage.
- Vision: Affordable plan with national network.
- Pre-Tax Savings: HSA and FSAs for eligible expenses.
- Retirement: Competitive retirement package to secure your future.