Experience Inc. Jobs

Job Information

The University of Chicago Data Quality Engineer - JR28605-3800 in Chicago, Illinois

This job was posted by https://illinoisjoblink.illinois.gov. For more information, please see: https://illinoisjoblink.illinois.gov/jobs/12392945

Department

BSD CTD - User Services - GDC

About the Department

The Center for Translational Data Science (CTDS) at the University of Chicago is a research center whose mission is to develop the discipline of translational data science and apply it to impactful problems in biology, medicine, healthcare, and the environment. We envision a world in which researchers have ready access to the data needed and the tools required to make data-driven discoveries that increase our scientific knowledge and improve the quality of life. We architect ecosystems of large-scale commons of research data, computing resources, applications, tools, and services for the broader research community to use data at scale to pursue scientific inquiry and accelerate discovery. Learn more at https://gdc.cancer.gov/, https://gen3.org/, https://stats.gen3.org/, and https://ctds.uchicago.edu/.

This at-will position is wholly or partially funded by contractual grant funding which is renewed under provisions set by the grantor of the contract. Employment will be contingent upon the continued receipt of these grant funds and satisfactory job performance.

Job Summary

The job works independently to perform a variety of activities relating to software support and/or development. Analyzes, designs, develops, debugs, and modifies computer code for end user applications, beta general releases, and production support. Guides development and implementation of applications, web pages, and user-interfaces using a variety of software applications, techniques, and tools. Solves complex problems in administration, maintenance, integration, and troubleshooting of code and application ecosystem currently in production.

The Data Quality Engineer is a problem solver with a background working in data integrity and testing to ensure high-quality data and metadata are distributed to the cancer research community. This is an opportunity to elevate your career working with one of the world's largest collections of harmonized cancer genomic data. This role focuses on the Genomic Data Commons, which is at the forefront of both cutting-edge research and production systems supporting cancer research. Your role will be as the lead engineer for data quality and integrity, joining a team of engineers developing innovative technologies in the pursuit of discovery through data-driven cancer research. You will focus on leading data quality efforts related to data integration, higher level data products, and distribution to the cancer research community, working across multiple teams to build and automate frameworks such as anomaly detection, reporting, and alerting to ensure data quality. You will gain expertise not only in the data itself but also in the systems used to interrogate the data and understand gaps in data quality. Data and metadata quality has a broad scope, so you are expected to work collaboratively across teams to determine priorities and best methods for achieving objectives. Additionally, support for end users will be required through user communications and documentation.

Responsibilities

Drive the design of the data QA infrastructure and execution of testing protocols to validate pipelines, integrated datasets, and data products.

Use a combination of exploratory, regression, and automated testing to ensure data quality standards. Assess appropriate inclusion/exclusion of data based on defined data dictionary.

Assist in evaluation and development of data dictionaries and utilize data specification and code to validate data as it relates to quality.

Assist in data release planning and implementation based on stakeholder requirements and data availability.

Proactively identify potential data issues and downstream impact. Identify existing data issues and perform research and root cause analyses to determine resolution. Work collaboratively with software engineers, bioinformaticians, and stakeholders to achieve and verify resolution.

Establish and maintain processes and standards to improve data quality assurance and implement efficiencies in data management.

Define measurements and metrics to conduct and present routine data reports to the project team and stakeholders.

Participate in data acquisition and integration planning efforts including data modeling, data dictionary definitions, and data harmonization pipeline development.

Develop a deep understanding of multiple genomic datasets and the technical data management software and processes of the underlying system.

Define data quality and integrity criteria and develop a comprehensive data quality management plan to lead key data QC efforts through team collaboration for all phases of the data management life cycle.

Contribute written knowledge and expertise to system documentation, user documentation, scientific manuscripts, reporting, grant proposals and reports, and presentation materials. Maintain broad knowledge of existing and emerging technologies and QC tools in the cancer genomics space.

Use a deep understanding of the data, scientific goals
