Francis, B., Pennoni, F. (2014). A comparison between the latent Markov and growth mixture models for the analysis of longitudinal data. Paper presented at: MBC2 Workshop on model based clustering and classification, Catania.
A comparison between the latent Markov and growth mixture models for the analysis of longitudinal data
PENNONI, FULVIA
2014
Abstract
This talk takes as its starting point outlier detection in mixture models where there may be multiple outliers. The literature on continuous data has grown and now includes fitting mixtures of t-distributions rather than normal distributions, or robustifying the fit by modifying the likelihood in various ways. In contrast, this work focuses on discrete data and considers latent class models with a large number of binary or multinomial indicators. In large data sets with a large number of indicators, poor identification of the underlying structure can easily occur. Moreover, outliers may not be detectable through classic latent class analysis, and classes will be contaminated. We propose a method that uses latent class outliers as seeds for new groups, together with a new algorithm for fitting latent class models in which some of the latent classes are small. We show that such latent class solutions are often better than those obtained from random start points. We take the view that outliers are interesting and informative, and need to be used rather than ignored or removed. We illustrate the problem with a latent class analysis of the criminal typologies of female offenders in England and Wales, which extends statistical work on criminal typologies. Methods of detecting outliers in such problems will be presented and discussed, and the benefit of developing more comprehensive approaches to latent class analysis will be emphasised.
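The abstract does not spell out the algorithm, so the following is only a minimal sketch of one possible reading of "outliers as seeds for new groups", not the authors' method. It assumes binary indicators, a plain EM fit, and that an outlier is an observation with low log-likelihood under the current fit. The function names (`log_joint`, `fit_lca`, `seed_outliers`), the smoothing constant, and the simulated data are all hypothetical and introduced purely for illustration.

```python
import numpy as np


def log_joint(X, pi, theta):
    # (n, K) matrix: log pi_k + sum_j log Bernoulli(x_ij | theta_kj)
    return np.log(pi) + X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T


def fit_lca(X, theta0, pi0=None, n_iter=300, eps=1e-6):
    """Plain EM for a latent class model with binary indicators.

    theta0 is a (K, J) matrix of starting item probabilities;
    pi0 is an optional length-K vector of starting class sizes.
    """
    theta = np.clip(np.asarray(theta0, dtype=float), eps, 1 - eps)
    K = theta.shape[0]
    pi = np.full(K, 1.0 / K) if pi0 is None else np.asarray(pi0, dtype=float)
    for _ in range(n_iter):
        # E-step: posterior class membership probabilities
        lj = log_joint(X, pi, theta)
        resp = np.exp(lj - np.logaddexp.reduce(lj, axis=1, keepdims=True))
        # M-step: update class sizes and item probabilities
        nk = resp.sum(axis=0) + eps
        pi = nk / nk.sum()
        theta = np.clip(resp.T @ X / nk[:, None], eps, 1 - eps)
    return pi, theta


def seed_outliers(X, pi, theta, n_seed=1, smooth=0.1):
    """Build start values with extra classes seeded at the worst-fitting rows.

    Assumption: "outlier" means lowest log-likelihood under the current fit.
    """
    ll = np.logaddexp.reduce(log_joint(X, pi, theta), axis=1)
    worst = np.argsort(ll)[:n_seed]
    # Pull each outlier's 0/1 pattern slightly towards 0.5 so that the
    # seeded class does not start at degenerate probabilities.
    seeds = X[worst] * (1 - 2 * smooth) + smooth
    theta0 = np.vstack([theta, seeds])
    # Each seeded class starts with the weight of a single observation.
    w = n_seed / len(X)
    pi0 = np.concatenate([pi * (1 - w), np.full(n_seed, 1.0 / len(X))])
    return theta0, pi0, worst


# Simulated data (invented for illustration): two large classes, one small.
rng = np.random.default_rng(0)
big_a = rng.random((500, 10)) < np.r_[np.full(5, 0.9), np.full(5, 0.1)]
big_b = rng.random((500, 10)) < np.r_[np.full(5, 0.1), np.full(5, 0.9)]
small = rng.random((15, 10)) < 0.9
X = np.vstack([big_a, big_b, small]).astype(float)

# Fit two classes from a random start, then refit with one seeded class.
pi2, theta2 = fit_lca(X, rng.uniform(0.3, 0.7, size=(2, 10)))
theta0, pi0, worst = seed_outliers(X, pi2, theta2)
pi3, theta3 = fit_lca(X, theta0, pi0)
```

In this sketch the seeded class starts with the weight of a single observation, so the refit can keep it small rather than forcing it to absorb part of a large class. Whether a seeded solution actually improves on random starts would need to be checked, for example by comparing log-likelihoods or information criteria across the two fits.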