Measuring and improving data quality in an organization or in a group of interacting organizations is a complex task. Several methodologies have been developed in the past, providing a basis for the definition of a complete data quality program that applies assessment and improvement techniques in order to guarantee high data quality levels. Since the main limitation of existing approaches is their specialization on specific issues or contexts, this paper presents the Comprehensive Data Quality (CDQ) methodology, which aims at integrating and enhancing the phases, techniques and tools proposed by previous approaches. The CDQ methodology is conceived to be at the same time complete, flexible and simple to apply. Completeness is achieved by considering existing techniques and tools and integrating them in a framework that can work in both intra- and inter-organizational contexts, and can be applied to all types of data. The methodology is flexible since it supports the user in the selection of the most suitable techniques and tools within each phase and in any context. Finally, CDQ is simple since it is organized in phases, and each phase is characterized by a specific goal and techniques to apply. The methodology is explained by means of a running example. © 2006 IEEE.

Batini, C., Cabitza, F., Cappiello, C., Francalanci, C. (2006). A comprehensive data quality methodology for web and structured data. In ICDIM'06: Proceedings of the First IEEE International Conference on Digital Information Management (pp. 448-456). IEEE [10.1109/ICDIM.2007.369236].

A comprehensive data quality methodology for web and structured data

Batini, C; Cabitza, FA
2006

slide + paper
data quality, structured data
English
ICDIM'06: Proceedings of the First IEEE International Conference on Digital Information Management, December 6-8, 2006
978-1-4244-0682-1
2006
pp. 448-456
4221928
none
Use this identifier to cite or link to this document: https://hdl.handle.net/10281/13335