2021 journal article

Ensuring Data Readiness for Quality Requirements with Help from Procedure Reuse

ACM Journal of Data and Information Quality, 13(3).

author keywords: Data and information quality; data integration in Big Data; data cleaning in Big Data; Big Data quality and analytics; Big Data quality in business process; Big Data quality management processes, frameworks and models
TL;DR: This research presents a meta-modelling architecture that automates the labor-intensive, and therefore time-consuming and expensive, process of manually cataloging data in order to assess and improve data quality. (via Semantic Scholar)
Source: Web Of Science
Added: February 21, 2022

Assessing and improving the quality of data are fundamental challenges in Big-Data applications. These challenges have given rise to numerous solutions targeting transformation, integration, and cleaning of data. However, while schema design, data cleaning, and data migration are nowadays reasonably well understood in isolation, not much attention has been given to the interplay between standalone tools in these areas. In this article, we focus on the problem of determining whether the available data-transforming procedures can be used together to bring about the desired quality characteristics of the data in business or analytics processes. For example, to help an organization avoid building a data-quality solution from scratch when it faces a new analytics task, we ask whether the data quality can be improved by reusing the tools that are already available, and if so, which tools to apply and in which order. We answer these questions without presuming knowledge of the internals of the tools, which may be external or proprietary.
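The abstract treats the procedures as black boxes, so one way to make the composition question concrete is to describe each procedure only by the quality properties it requires and the properties it guarantees, and then search for an ordering that reaches the desired quality state. The sketch below illustrates this idea; it is not the article's actual algorithm. The `Procedure` abstraction, the function `plan_reuse`, the example catalog, and the monotonicity assumption (guarantees only accumulate and are never destroyed by later steps) are all illustrative assumptions introduced here.

```python
from collections import deque
from typing import FrozenSet, Iterable, List, Tuple

# Hypothetical abstraction: a black-box procedure is known only by the quality
# properties it requires beforehand and the properties it ensures afterwards.
Procedure = Tuple[str, FrozenSet[str], FrozenSet[str]]  # (name, requires, ensures)

def plan_reuse(start: FrozenSet[str],
               goal: FrozenSet[str],
               procedures: Iterable[Procedure]) -> List[str]:
    """Breadth-first search for an ordering of procedures that turns the
    initial quality state `start` into one satisfying every property in
    `goal`. Returns procedure names in application order."""
    procedures = list(procedures)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:                    # all desired properties hold
            return plan
        for name, requires, ensures in procedures:
            if requires <= state:            # procedure is applicable here
                nxt = state | ensures        # assumption: guarantees accumulate
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, plan + [name]))
    raise ValueError("no composition of the available procedures reaches the goal")

# Invented example catalog: deduplication and imputation both presuppose an
# aligned schema, which match_schema establishes.
catalog = [
    ("match_schema", frozenset(),                   frozenset({"schema_aligned"})),
    ("deduplicate",  frozenset({"schema_aligned"}), frozenset({"duplicate_free"})),
    ("fill_missing", frozenset({"schema_aligned"}), frozenset({"complete"})),
]
print(plan_reuse(frozenset(), frozenset({"duplicate_free", "complete"}), catalog))
# -> ['match_schema', 'deduplicate', 'fill_missing']
```

Because the search only consults the declared requires/ensures signatures, it never inspects the tools themselves, mirroring the setting in which the procedures are external or proprietary.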