Motivated by an application to a longitudinal data set coming from the Health and Retirement Study about self-reported health status, we propose a model for longitudinal data which is based on a latent process to account for the unobserved heterogeneity between sample units in a dynamic fashion. The latent process is modelled by a mixture of auto-regressive AR(1) processes with different means and correlation coefficients, but with equal variances. We show how to perform maximum likelihood estimation of the proposed model by the joint use of an expectation–maximization algorithm and a Newton–Raphson algorithm, implemented by means of recursions developed in the hidden Markov model literature. We also introduce a simple method to obtain standard errors for the parameter estimates and suggest a strategy to choose the number of mixture components. In the application the response variable is ordinal; however, the approach may also be applied in other settings. Moreover, the application to the self-reported health status data set allows us to show that the model proposed is more flexible than other models for longitudinal data based on a continuous latent process. The model also achieves a goodness of fit that is similar to that of models based on a discrete latent process following a Markov chain, while retaining a reduced number of parameters. The effect of different formulations of the latent structure of the model is evaluated in terms of estimates of the regression parameters for the covariates.
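To make the latent structure described above concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes a stationary parametrization in which each mixture component has its own mean and correlation coefficient while the marginal variance is shared, and it assumes a cumulative-logit link with fixed cutpoints for the ordinal response. The function name `simulate_mixture_ar1`, its arguments, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate_mixture_ar1(n, T, weights, means, rhos, sigma, cutpoints, seed=0):
    """Simulate ordinal panel data driven by a mixture of latent AR(1) processes.

    Assumed (illustrative) parametrization: component k has mean means[k] and
    correlation rhos[k]; all components share the marginal variance sigma**2.
    """
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    rhos = np.asarray(rhos, dtype=float)
    cutpoints = np.asarray(cutpoints, dtype=float)

    # Draw one mixture component per sample unit.
    comp = rng.choice(len(weights), size=n, p=weights)
    mu, rho = means[comp], rhos[comp]

    # Latent AR(1) path; the innovation sd is scaled so that the marginal
    # variance stays equal to sigma**2 in every component.
    alpha = np.empty((n, T))
    alpha[:, 0] = mu + sigma * rng.standard_normal(n)
    innov_sd = sigma * np.sqrt(1.0 - rho**2)
    for t in range(1, T):
        alpha[:, t] = mu + rho * (alpha[:, t - 1] - mu) + innov_sd * rng.standard_normal(n)

    # Ordinal response via the latent-variable form of a cumulative logit
    # (an assumption here): the category is the number of cutpoints exceeded.
    ystar = alpha + rng.logistic(size=(n, T))
    y = (ystar[:, :, None] > cutpoints).sum(axis=2)
    return comp, alpha, y

comp, alpha, y = simulate_mixture_ar1(
    n=200, T=5, weights=[0.5, 0.5], means=[-1.0, 1.0],
    rhos=[0.3, 0.9], sigma=1.0, cutpoints=[-1.0, 0.0, 1.0],
)
```

Here `comp` holds the component label of each unit, `alpha` the latent paths, and `y` the ordinal categories coded from 0 to 3. Estimation, which the abstract describes as combining expectation–maximization and Newton–Raphson steps, is not covered by this sketch.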
Bartolucci, F., Bacci, S., Pennoni, F. (2014). Longitudinal analysis of self-reported health status by mixture latent auto-regressive models. Journal of the Royal Statistical Society: Series C (Applied Statistics), 63, 267-288. doi:10.1111/rssc.12030.
Longitudinal analysis of self-reported health status by mixture latent auto-regressive models
PENNONI, FULVIA
2014