A latent Markov model from a new perspective with an application

Bartolucci, F; Pennoni, F; Vittadini, G

We propose using the latent Markov (LM) model to estimate multiple causal effects in observational studies where there are unobserved baseline differences between individuals. The model, which in its basic formulation is tailored to longitudinal data analysis, was first introduced by Wiggins (1951) and then formalized in his Ph.D. thesis (Wiggins, 1955). Bartolucci et al. (2013) give several extensions of the basic formulation and propose new models. Because the assumptions encoded by the model can be represented by a path diagram, this class of models is a powerful tool for the analysis of statistical data. Indeed, as stated in Pennoni (2014), such models may be seen as built on the foundation of the graphical causal models first proposed by Wright (1921) in genetics. Many statistical models tailored to the estimation of causal effects have been proposed since then, and the potential outcome framework has proved to be one of the most useful tools. However, in a longitudinal setting this framework is still not well developed, and the same holds for some powerful models developed in the econometric context; see also Romeo (2014). Building on the above models and on a recent proposal of Lanza et al. (2013), we introduce a new use of propensity score weighting (Rosenbaum and Rubin, 1983) for multivariate responses observed at multiple time occasions. We state the assumptions that must hold for the proposed approach to be applicable in the context of the study. The use of the latent Markov model helps to obtain a reliable estimate of the average causal effect. An interesting feature of the proposed approach is its flexibility: the adopted parameterization allows us to deal with any kind of response variable.
The model is fitted by a two-step maximum likelihood procedure. First, a multinomial logit model is estimated for the probability of taking each type of treatment given suitably chosen pretreatment covariates. Then, a weighted log-likelihood of the latent Markov model, with weights computed from the estimates obtained at the first step, is maximized to obtain the final parameter estimates. This second step relies on the EM algorithm (Baum et al., 1970; Dempster et al., 1977), and reliable standard errors for the model parameters are obtained by a nonparametric bootstrap method (Davison and Hinkley, 1997). The proposed application is particularly suitable for illustrating the model formulation, as it concerns the evaluation of human capital development during the critical period of university-to-work transition in the first years of the economic crisis. Human capital, as stated in Harpan and Draghici (2014), is defined by the generic knowledge and skills gained through work experience and education. By applying the proposed model to a regionally representative longitudinal dataset on a cohort of graduates in the labour market, we show some results that are useful for policy makers, such as how attaining different types of academic degree affects the work path.
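The two estimation steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that are not in the abstract: the function names are hypothetical, the weights are taken as the inverse of the estimated probability of the treatment actually received and rescaled to sum to the sample size, and the latent Markov model is reduced to a single categorical response with time-homogeneous transitions. The EM maximization and the bootstrap are omitted; only the weighted objective that the second step would maximize is shown.

```python
import math

def softmax(scores):
    """Turn linear predictors into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fit_multinomial_logit(X, z, n_treatments, n_iter=2000, lr=0.1):
    """Step 1: fit P(Z = k | x) by gradient ascent on the mean log-likelihood.

    X holds pretreatment covariate rows (no intercept column) and z holds
    treatment labels in 0..K-1. Treatment 0 is the reference category, so
    its coefficients stay at zero.
    """
    n = len(X)
    rows = [[1.0] + list(x) for x in X]  # prepend an intercept
    p = len(rows[0])
    beta = [[0.0] * p for _ in range(n_treatments)]
    for _ in range(n_iter):
        grad = [[0.0] * p for _ in range(n_treatments)]
        for x, zi in zip(rows, z):
            probs = softmax([sum(b * v for b, v in zip(bk, x)) for bk in beta])
            for k in range(1, n_treatments):
                resid = (1.0 if zi == k else 0.0) - probs[k]
                for j in range(p):
                    grad[k][j] += resid * x[j] / n
        for k in range(1, n_treatments):
            for j in range(p):
                beta[k][j] += lr * grad[k][j]
    return beta

def propensity_weights(X, z, beta):
    """Inverse probability of the received treatment, rescaled to sum to n.

    The rescaling is an illustrative choice, not taken from the abstract.
    """
    raw = []
    for x, zi in zip(X, z):
        row = [1.0] + list(x)
        probs = softmax([sum(b * v for b, v in zip(bk, row)) for bk in beta])
        raw.append(1.0 / probs[zi])
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]

def lm_loglik(y, init, trans, emis):
    """Log-likelihood of one response sequence via the scaled forward recursion.

    init[u] is the initial probability of latent state u, trans[u][v] the
    transition probability from u to v, and emis[u][c] the probability of
    response category c given state u. Assumes the sequence has positive
    likelihood under these parameters.
    """
    k = len(init)
    alpha = [init[u] * emis[u][y[0]] for u in range(k)]
    loglik = 0.0
    for t in range(1, len(y)):
        c = sum(alpha)
        loglik += math.log(c)
        alpha = [a / c for a in alpha]  # rescale to avoid underflow
        alpha = [sum(alpha[u] * trans[u][v] for u in range(k)) * emis[v][y[t]]
                 for v in range(k)]
    return loglik + math.log(sum(alpha))

def weighted_loglik(weights, sequences, init, trans, emis):
    """Step 2 objective: sum over units of weight times unit log-likelihood."""
    return sum(w * lm_loglik(y, init, trans, emis)
               for w, y in zip(weights, sequences))
```

In this sketch, an EM routine would repeatedly update `init`, `trans` and `emis` to increase `weighted_loglik`, and bootstrap standard errors would presumably come from repeating both steps on units resampled with replacement.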

Bartolucci, F., Pennoni, F., Vittadini, G. (2015). A latent Markov model from a new perspective with an application. In FIRB (“Futuro in ricerca” 2012). Firenze.