Measuring and improving data quality in an organisation or in a group of interacting organisations is a complex task. Several methodologies have been developed in the past, providing a basis for the definition of a data quality programme that guarantees high data quality levels. Since the main limitation of existing approaches is their specialisation on specific issues or contexts, this paper presents a Comprehensive Data Quality (CDQ) methodology. The main aim of the CDQ methodology is the integration and enhancement of the phases, techniques and tools proposed by previous approaches. In particular, the CDQ methodology is conceived to be at the same time complete, flexible and simple to apply. Completeness is achieved by considering existing techniques and tools and integrating them in a framework that can work in any organisation. The methodology is flexible, since it supports the user in the selection of the most suitable techniques and tools within each phase and in any context. Finally, CDQ is simple, since it is organised in phases and each phase is characterised by a specific goal and a set of techniques to apply. The methodology is explained by means of a running example, and significant cases of its application are reported.
Batini, C., Cabitza, F., Cappiello, C., Francalanci, C. (2008). A comprehensive data quality methodology for web and structured data. International Journal of Innovative Computing and Applications, 1(3), 205-218. doi:10.1504/IJICA.2008.019688.
A comprehensive data quality methodology for web and structured data
BATINI, CARLO; CABITZA, FEDERICO ANTONIO NICCOLO' AMEDEO
2008