Data Quality Engineer

Shoreland, United States

University of Chicago

One of the world’s leading research universities, the University of Chicago inspires scholars to pursue field-defining research, while providing a transformative education for students.

Department

BSD CTD - User Services - GDC


About the Department

The Center for Translational Data Science (CTDS) at the University of Chicago is a research center whose mission is to develop the discipline of translational data science and apply it to impactful problems in biology, medicine, healthcare, and the environment. We envision a world in which researchers have ready access to the data and tools they need to make data-driven discoveries that increase our scientific knowledge and improve quality of life. We architect ecosystems of large-scale commons of research data, computing resources, applications, tools, and services so that the broader research community can use data at scale to pursue scientific inquiry and accelerate discovery. Learn more at https://gdc.cancer.gov/, https://gen3.org/, https://stats.gen3.org/, and https://ctds.uchicago.edu/.


Job Summary

The Data Quality Engineer is a problem solver with a background in data integrity and testing who ensures that high-quality data and metadata are distributed to the cancer research community. This is an opportunity to elevate your career working with one of the world's largest collections of harmonized cancer genomic data. This role focuses on the Genomic Data Commons, which is at the forefront of both cutting-edge research and production systems supporting cancer research. You will serve as an engineer for data quality and integrity, joining a team of engineers developing innovative technologies in the pursuit of discovery through data-driven cancer research. You will focus on data quality efforts related to data integration, higher-level data products, and distribution to the cancer research community, working across multiple teams to build and automate frameworks such as anomaly detection, reporting, and alerting to ensure data quality. You will gain expertise not only in the data itself but also in the systems used to interrogate the data and understand gaps in data quality. Data and metadata quality has a broad scope, so you are expected to work collaboratively across teams to determine priorities and the best methods for achieving objectives. Additionally, you will be required to support end users through user communications and documentation.

The job performs a variety of activities relating to software support and/or development. Provides analysis, design, development, debugging, and modification of computer code for end user applications, beta general releases, web pages, and production support. Troubleshoots problems using existing procedures to find a possible solution.

This at-will position is wholly or partially funded by contractual grant funding which is renewed under provisions set by the grantor of the contract. Employment will be contingent upon the continued receipt of these grant funds and satisfactory job performance.

Responsibilities

  • Drive the design of the data QA infrastructure and execution of testing protocols to validate pipelines, integrated datasets, and data products.

  • Use a combination of exploratory, regression, and automated testing to ensure data quality standards. Assess appropriate inclusion/exclusion of data based on defined data dictionary.

  • Assist in evaluation and development of data dictionaries and utilize data specification and code to validate data as it relates to quality.

  • Assist in data release planning and implementation based on stakeholder requirements and data availability. 

  • Proactively identify potential data issues and downstream impact. Identify existing data issues and perform research and root cause analyses to determine resolution. Work collaboratively with software engineers, bioinformaticians, and stakeholders to achieve and verify resolution.

  • Establish and maintain processes and standards to improve data quality assurance and implement efficiencies in data management.

  • Define measurements and metrics to conduct and present routine data reports to the project team and stakeholders.

  • Participate in data acquisition and integration planning efforts including data modeling, data dictionary definitions, and data harmonization pipeline development.

  • Develop a deep understanding of multiple genomic datasets and the technical data management software and processes of the underlying system.

  • Define data quality and integrity criteria and develop a comprehensive data quality management plan to lead key data QC efforts through team collaboration for all phases of the data management life cycle.

  • Contribute written knowledge and expertise to system documentation, user documentation, scientific manuscripts, reporting, grant proposals and reports, and presentation materials. Stay abreast of existing and emerging technologies and QC tools in the cancer genomics space.

  • Use a deep understanding of the data, scientific goals and methodology, and underlying biological and translational concepts in assigned data commons and cloud environments to provide user support in high-profile and troubling cases.

  • Coordinate on user management and issue resolution with functional teams, including, but not necessarily limited to, operations, development, design, bioinformatics, data science, project management, and information security.

  • Investigates, analyzes and resolves day-to-day technical problems using standard procedures.

  • Works with stakeholders to gather and analyze requirements for developmental programs. Receives a moderate level of guidance to design applications to meet University and business requirements.

  • Performs code testing on components and works to ensure that appropriate implementation standards are met. Evaluates design alternatives for development cost and solutions using various methods.

  • Supports and maintains existing applications. Works with web developers and responds to requests from users.

  • Performs other related work as needed.


Minimum Qualifications

Education:

Minimum requirements include a college or university degree in related field.


Work Experience:

Minimum requirements include knowledge and skills developed through 2-5 years of work experience in a related job discipline.


Certifications:

---

Preferred Qualifications

Education:

  • Bachelor's degree in Computer Science, Informatics, Bioinformatics, Biological Sciences, or related field.

Experience:

  • Minimum two (2) years of experience working in data quality and integrity engineering or testing.

  • Experience with data modeling, analysis, design, development, testing, and documentation.

  • Experience with data quality standards and practices.

  • Experience writing and executing data-centric test cases to validate data.

  • Experience writing database queries, reading and understanding database queries, and utilizing other database artifacts.

  • Experience with Python.

  • Experience working with Linux/Unix systems and basic shell scripting.

  • Experience with biospecimen and clinical data curation.

  • Experience with advanced high-throughput genomic technologies.

  • Experience providing bioinformatics services or support.

  • Experience using NCI datasets (TCGA, TARGET, and CGCI).

  • Experience with graph and NoSQL databases.

Preferred Competencies

  • Ability to collaborate well across a team environment.

  • Ability and willingness to acquire new programming languages, statistical and computational methods, and background in research area.

  • Ability to prioritize and manage workload to meet critical project milestones and deadlines.

  • Confidentiality related to sensitive matters such as strategic initiatives, trade secrets, quiet periods, and scientific discoveries yet to be put in the public domain.

  • Ability to take a broad plan and break it into incremental tasks and oversee the completion of each task.

  • Ability to come into a team used to minimal supervision and oversight and ensure accountability for deliverables and outcomes.

  • Ability to persuade others to adopt new structures or systems to meet objectives.

Working Conditions

  • Preferred: Hybrid office/remote conditions with long stretches of time in front of a computer.

  • Optional: Fully remote.

Application Documents

  • Resume (required)

  • Cover Letter (preferred)


When applying, the document(s) MUST be uploaded via the My Experience page, in the section titled Application Documents of the application.


Job Family

Information Technology


Role Impact

Individual Contributor


Scheduled Weekly Hours

40


Drug Test Required

No


Health Screen Required

No


Motor Vehicle Record Inquiry Required

No


Pay Rate Type

Salary


FLSA Status

Exempt


Pay Range

$80,000.00 - $120,000.00

The included pay rate or range represents the University’s good faith estimate of the possible compensation offer for this role at the time of posting.


Benefits Eligible

Yes

The University of Chicago offers a wide range of benefits programs and resources for eligible employees, including health, retirement, and paid time off. Information about the benefit offerings can be found in the Benefits Guidebook.


Posting Statement

The University of Chicago is an equal opportunity employer and does not discriminate on the basis of race, color, religion, sex, sexual orientation, gender, gender identity, or expression, national or ethnic origin, shared ancestry, age, status as an individual with a disability, military or veteran status, genetic information, or other protected classes under the law. For additional information please see the University's Notice of Nondiscrimination.

 

Job seekers in need of a reasonable accommodation to complete the application process should call 773-702-5800 or submit a request via Applicant Inquiry Form.

 

All offers of employment are contingent upon a background check that includes a review of conviction history.  A conviction does not automatically preclude University employment.  Rather, the University considers conviction information on a case-by-case basis and assesses the nature of the offense, the circumstances surrounding it, the proximity in time of the conviction, and its relevance to the position.

 

The University of Chicago's Annual Security & Fire Safety Report (Report) provides information about University offices and programs that provide safety support, crime and fire statistics, emergency response and communications plans, and other policies and information. The Report can be accessed online at: http://securityreport.uchicago.edu. Paper copies of the Report are available, upon request, from the University of Chicago Police Department, 850 E. 61st Street, Chicago, IL 60637.
