This work considers outlier detection in mixture models where multiple outliers may contaminate the main data. Methods in the literature for continuous data have grown and now include fitting mixtures of t-distributions rather than normal distributions, or robustifying the fit by modifying the likelihood in various ways. In contrast, the focus here is on discrete data: the approach aims to obtain better estimates of the main underlying structure of latent class models in large data sets by detecting outliers and fitting them as separate mixture and latent class components. We propose a method that uses latent class outliers as seeds for new groups, and a new algorithm for fitting latent class models in which some of the latent classes are small. We show that such latent class solutions are often better than those obtained from random start points.
Francis, B., Pennoni, F., Pandolfi, S., Bartolucci, F. (2014). Robust latent class analysis through outlier detection and modelling. In Book of Abstract 7th International Conference of the ERCIM Working Group on Computational and Methodological Statistics.
Robust latent class analysis through outlier detection and modelling
PENNONI, FULVIA
2014