Data Quality Engineer
Shoreland
Full Time Senior-level / Expert USD 98K - 137K
University of Chicago
One of the world’s leading research universities, the University of Chicago inspires scholars to pursue field-defining research, while providing a transformative education for students.
Department
BSD CTD - User Services - GDC
About the Department
This at-will position is wholly or partially funded by contractual grant funding which is renewed under provisions set by the grantor of the contract. Employment will be contingent upon the continued receipt of these grant funds and satisfactory job performance.
Job Summary
The Data Quality Engineer is a problem solver with a background in data integrity and testing who ensures that high-quality data and metadata are distributed to the cancer research community. This is an opportunity to elevate your career working with one of the world's largest collections of harmonized cancer genomic data. This role focuses on the Genomic Data Commons, which is at the forefront of both cutting-edge research and production systems supporting cancer research. You will be the lead engineer for data quality and integrity, joining a team of engineers developing innovative technologies in the pursuit of discovery through data-driven cancer research. You will lead data quality efforts related to data integration, higher-level data products, and distribution to the cancer research community, working across multiple teams to build and automate frameworks such as anomaly detection, reporting, and alerting. You will gain expertise not only in the data itself but also in the systems used to interrogate the data and identify gaps in data quality. Because data and metadata quality has a broad scope, you are expected to work collaboratively across teams to determine priorities and the best methods for achieving objectives. You will also support end users through user communications and documentation.
Responsibilities
Drive the design of the data QA infrastructure and execution of testing protocols to validate pipelines, integrated datasets, and data products.
Use a combination of exploratory, regression, and automated testing to ensure data quality standards are met. Assess appropriate inclusion/exclusion of data based on the defined data dictionary.
Assist in evaluation and development of data dictionaries and utilize data specification and code to validate data as it relates to quality.
Assist in data release planning and implementation based on stakeholder requirements and data availability.
Proactively identify potential data issues and downstream impact. Identify existing data issues and perform research and root cause analyses to determine resolution. Work collaboratively with software engineers, bioinformaticians, and stakeholders to achieve and verify resolution.
Establish and maintain processes and standards to improve data quality assurance and implement efficiencies in data management.
Define measurements and metrics to conduct and present routine data reports to the project team and stakeholders.
Participate in data acquisition and integration planning efforts including data modeling, data dictionary definitions, and data harmonization pipeline development.
Develop a deep understanding of multiple genomic datasets and the technical data management software and processes of the underlying system.
Define data quality and integrity criteria and develop a comprehensive data quality management plan to lead key data QC efforts through team collaboration for all phases of the data management life cycle.
Contribute written knowledge and expertise to system documentation, user documentation, scientific manuscripts, reporting, grant proposals and reports, and presentation materials. Maintain broad knowledge of existing and emerging technologies and QC tools in the cancer genomics space.
Use a deep understanding of the data, scientific goals and methodology, and underlying biological and translational concepts in assigned data commons and cloud environments to provide user support in high-profile and challenging cases.
Coordinate on user management and issue resolution with functional teams, including, but not necessarily limited to, operations, development, design, bioinformatics, data science, project management, and information security.
Design new systems, features, and tools. Solve complex problems and identify opportunities for technical improvement and performance optimization. Review and test code to ensure appropriate standards are met.
Apply technical knowledge of existing and emerging technologies, including public cloud offerings from Amazon Web Services, Microsoft Azure, and Google Cloud.
Perform other related work as needed.
Minimum Qualifications
Education:
Minimum requirements include a college or university degree in a related field.
Work Experience:
Certifications:
---
Preferred Qualifications
Education:
Bachelor's degree in Computer Science, Informatics, Bioinformatics, Biological Sciences, or a related field.
Master's or doctoral degree in Computer Science, Informatics, Bioinformatics, Biological Sciences, or a related field highly preferred.
Experience:
Experience working in data quality and integrity engineering or testing.
Experience with data modeling, analysis, design, development, testing, and documentation.
Experience with data quality standards and practices.
Experience writing and executing data-centric test cases to validate data.
Experience writing database queries, reading and understanding database queries, and utilizing other database artifacts.
Experience with Python.
Experience working with Linux/Unix systems and basic shell scripting.
Experience with biospecimen and clinical data curation.
Experience with advanced high-throughput genomic technologies.
Experience providing bioinformatics services or support.
Experience using NCI datasets (TCGA, TARGET, and CGCI).
Experience with graph and NoSQL databases.
Preferred Competencies
Ability to lead across a collaborative team environment.
Ability and willingness to learn new programming languages and statistical and computational methods, and to acquire background in the research area.
Ability to prioritize and manage workload to meet critical project milestones and deadlines.
Confidentiality related to sensitive matters such as strategic initiatives, trade secrets, quiet periods, and scientific discoveries yet to be put in the public domain.
Ability to take a broad plan and break it into incremental tasks and oversee the completion of each task.
Ability to come into a team used to minimal supervision and oversight and ensure accountability for deliverables and outcomes.
Ability to persuade others to adopt new structures or systems to meet objectives.
Ability to earn the trust of management and the authority needed to successfully coordinate the team.
Working Conditions
Office environment.
Application Documents
Resume (required)
Cover Letter (preferred)
When applying, the document(s) MUST be uploaded via the My Experience page, in the section titled Application Documents of the application.
Pay Range
The included pay rate or range represents the University’s good faith estimate of the possible compensation offer for this role at the time of posting.
Benefits Eligible
The University of Chicago offers a wide range of benefits programs and resources for eligible employees, including health, retirement, and paid time off. Information about the benefit offerings can be found in the Benefits Guidebook.
Posting Statement
The University of Chicago is an Affirmative Action/Equal Opportunity/Disabled/Veterans Employer and does not discriminate on the basis of race, color, religion, sex, sexual orientation, gender, gender identity, national or ethnic origin, age, status as an individual with a disability, military or veteran status, genetic information, or other protected classes under the law. For additional information please see the University's Notice of Nondiscrimination.
Staff job seekers in need of a reasonable accommodation to complete the application process should call 773-702-5800 or submit a request via the Applicant Inquiry Form.
We seek a diverse pool of applicants who wish to join an academic community that places the highest value on rigorous inquiry and encourages a diversity of perspectives, experiences, groups of individuals, and ideas to inform and stimulate intellectual challenge, engagement, and exchange.
All offers of employment are contingent upon a background check that includes a review of conviction history. A conviction does not automatically preclude University employment. Rather, the University considers conviction information on a case-by-case basis and assesses the nature of the offense, the circumstances surrounding it, the proximity in time of the conviction, and its relevance to the position.
The University of Chicago's Annual Security & Fire Safety Report (Report) provides information about University offices and programs that provide safety support, crime and fire statistics, emergency response and communications plans, and other policies and information. The Report can be accessed online at: http://securityreport.uchicago.edu. Paper copies of the Report are available, upon request, from the University of Chicago Police Department, 850 E. 61st Street, Chicago, IL 60637.