Handling missing data is a crucial problem in statistical analysis and is almost always addressed by imputation. Although the literature offers many imputation approaches, assessing the quality of imputation, i.e., appraising whether the imputed values or categories are plausible for the variables and units concerned, seems to have received less attention. This issue is critical in every field of application, including the medical context considered here: the assessment of cardiovascular disease risk. We addressed the problem of comparing the results obtained with different imputation methods and of assessing the quality of imputation through dissimilarity profile analysis (DPA), a multivariate exploratory method for the analysis of dissimilarity matrices. We also combined DPA with traditional profile analysis for data matrices to better understand the components that differentiate the imputation methods.

Solaro, N. (2019). Evaluating the quality of data imputation in cardiovascular risk studies through the dissimilarity profile analysis. In F. Greselin, L. Deldossi, L. Bagnato, M. Vichi (Eds.), Statistical Learning of Complex Data (pp. 81-92). Springer Berlin Heidelberg [10.1007/978-3-030-21140-0_9].

Evaluating the quality of data imputation in cardiovascular risk studies through the dissimilarity profile analysis

Solaro, N
First author
2019

Book chapter or essay
Euclidean distance; Level; Missing data; Scatter; Shape
English
Statistical Learning of Complex Data
Greselin, F; Deldossi, L; Bagnato, L; Vichi, M
2019
9783030211394
Springer Berlin Heidelberg
81-92


Use this identifier to cite or link to this document: https://hdl.handle.net/10281/261388