The longitudinal data collected by public administrations and large organisations are well suited to describing social and economic phenomena whose dynamics demand close attention from policy makers and civil servants. Unfortunately, the quality of the stored data is often very poor, so data cleansing is a mandatory step before the data can be exploited. This paper is driven by the idea that formal methods (specifically model checking) can contribute substantially to extracting, formalising, and refining consistency requirements from domain knowledge, and then to verifying the real data against the elicited requirements. We developed a methodology (the Robust Data Quality Analysis) that assesses the quality of both the original data and the cleansing results. We applied the proposed approach to a real-world scenario in the labour market domain, evaluating the consistency of millions of people's careers. The results show that our approach can contribute effectively to improving data cleansing activities.

Boselli, R., Cesarini, M., Mercorio, F., Mezzanzanica, M. (2014). Longitudinal data consistency verification using formal methods. International Journal of Information Quality, 3(3), 185-206 [10.1504/IJIQ.2014.064054].

Longitudinal data consistency verification using formal methods

Boselli, Roberto; Cesarini, Mirko; Mercorio, Fabio; Mezzanzanica, Mario
2014

Type: Journal article - Scientific article
Keywords: data quality; model checking; administrative archives; longitudinal data; information quality; consistency verification; labour market
Language: English
Date: 5 Aug 2014
Volume: 3
Issue: 3
First page: 185
Last page: 206


Use this identifier to cite or link to this document: https://hdl.handle.net/10281/53041