Batini, C., Rula, A., Scannapieco, M., Viscusi, G. (2015). From data quality to big data quality. Journal of Database Management, 26(1), 60-82 [10.4018/JDM.2015010103].
From data quality to big data quality
Batini, Carlo (first author); Rula, Anisa (second author); Viscusi, Gianluigi (last author)
2015
Abstract
This article investigates the evolution of data quality issues from traditional structured data managed in relational databases to Big Data. In particular, the paper examines the relationship between Data Quality and several research coordinates that are relevant in Big Data, such as the variety of data types, data sources, and application domains, focusing on maps, semi-structured texts, linked open data, sensors and sensor networks, and official statistics. Consequently, a set of structural characteristics is identified, and a systematization of the a posteriori correlation between them and quality dimensions is provided. Finally, Big Data quality issues are considered in a conceptual framework suitable for mapping the evolution of the quality paradigm according to three core coordinates that are significant in the context of the Big Data phenomenon: the data type considered, the source of data, and the application domain. Thus, the framework makes it possible to ascertain the relevant changes in data quality emerging with the Big Data phenomenon, through an integrative and theoretical literature review.

File | Access | Attachment type | Size | Format
---|---|---|---|---
Batini_et_Al-JDM.pdf (Description: Main article) | Open access | Author’s Accepted Manuscript, AAM (Post-print) | 450.69 kB | Adobe PDF
01j-Batini2015-editoriale.pdf | Archive managers only | Publisher’s Version (Version of Record, VoR) | 592.95 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.