Two methods based on the Forward Imputation (ForImp) approach are implemented for imputing missing quantitative data. One method alternates Nearest Neighbour Imputation (NNI) with Principal Component Analysis (function ForImp.PCA); the other uses NNI with the Mahalanobis distance (function ForImp.Mahala). ForImp is a sequential distance-based approach that imputes missing data in a forward, step-by-step process involving subsets of units according to their “completeness rate”. During the iterative process, the complete part of the data is updated and so grows progressively larger. No initialization of missing entries is required. ForImp belongs to the nonparametric, exploratory-descriptive framework, since it requires no a priori distributional assumptions about the data. Two supplementary functions (missing.gen and missing.gen0) are also provided to generate Missing Completely At Random (MCAR) values in a data matrix.
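The forward, step-by-step idea described above can be illustrated with a minimal sketch. It is not the package's implementation. It is written in Python rather than R, it uses plain Euclidean distance and a single nearest donor in place of the PCA-based and Mahalanobis-based variants, and the function names mcar_mask and forward_impute are hypothetical names chosen for this illustration.

```python
import math
import random


def mcar_mask(rows, n_missing, seed=0):
    """Return a copy of `rows` with `n_missing` cells set to None.

    Cells are drawn uniformly at random, independently of the data values,
    which is what "Missing Completely At Random" means.
    (Hypothetical helper; not the package's missing.gen.)
    """
    rng = random.Random(seed)
    cells = [(i, j) for i in range(len(rows)) for j in range(len(rows[0]))]
    out = [list(r) for r in rows]
    for i, j in rng.sample(cells, n_missing):
        out[i][j] = None
    return out


def forward_impute(rows):
    """Simplified forward nearest-neighbour imputation.

    Assumptions of this sketch: Euclidean distance, one donor per unit,
    and at least one fully observed unit to start from.
    """
    data = [list(r) for r in rows]  # work on a copy
    complete = [r for r in data if None not in r]
    if not complete:
        raise ValueError("at least one complete unit is needed to start")
    incomplete = [r for r in data if None in r]
    # Visit units from the most complete to the least complete.
    incomplete.sort(key=lambda r: sum(v is None for v in r))
    for unit in incomplete:
        observed = [j for j, v in enumerate(unit) if v is not None]
        # Nearest complete unit, measured only on this unit's observed variables.
        donor = min(
            complete,
            key=lambda c: math.dist(
                [unit[j] for j in observed], [c[j] for j in observed]
            ),
        )
        for j, v in enumerate(unit):
            if v is None:
                unit[j] = donor[j]
        # The imputed unit joins the complete part, which grows at each step.
        complete.append(unit)
    return data


rows = [[1.0, 2.0], [1.1, None], [5.0, 6.0], [None, 6.1]]
print(forward_impute(rows))
# -> [[1.0, 2.0], [1.1, 2.0], [5.0, 6.0], [5.0, 6.1]]
```

Because each imputed unit is appended to the complete part, later units can borrow from earlier imputations, and no starting guess for the missing entries is needed.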
Solaro, N., Barbiero, A., Manzi, G., Ferrari, P. (2015). GenForImp: The Forward Imputation: A Sequential Distance-Based Approach for Imputing Missing Data. R package version 1.0. [Software].
GenForImp: The Forward Imputation: A Sequential Distance-Based Approach for Imputing Missing Data. R package version 1.0.
SOLARO, NADIA (first author); 2015
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.