Hidden Markov models are a popular tool for the analysis of longitudinal data, allowing the dynamic clustering of sample units on the basis of a set of repeated responses. In the literature on longitudinal data analysis, these models are typically used in the presence of multivariate categorical data, that is, when several categorical responses are observed at each time occasion. These formulations rely on the assumption of local independence, according to which the responses are conditionally independent given the latent states. This assumption also simplifies the treatment of missing responses when the missing-at-random assumption is plausible. Here, we deal with the case of continuous multivariate responses in which, as in Gaussian mixture models, it is natural to assume that the continuous responses for the same time occasion are correlated, according to a specific variance-covariance matrix, even conditionally on the latent states. Although maximum likelihood estimation of this model is straightforward in standard cases using the Expectation-Maximization (EM) algorithm, we focus on its estimation when: (i) suitable constraints on the variance-covariance matrix are assumed; (ii) there are missing responses. The constraints we refer to are commonly adopted in the literature on Gaussian finite mixture models. Regarding the generation of missing data, we focus on the missing-at-random assumption, and we also account for possible individual covariates that may directly affect the responses (in addition to the latent states). In particular, we propose an EM algorithm that provides exact maximum likelihood estimates and also computes standard errors for the parameter estimates. The proposed approach is illustrated by a simulation study, to evaluate the computational load, and through a real case analysis. We also show how the proposal may be useful in a context of time-series analysis with an application to financial data.
An R implementation of the proposed algorithm is made available by the authors within the LMest package.
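
To make the role of the missing-at-random assumption concrete, the following is a minimal sketch, not the authors' algorithm and not the LMest interface. It assumes two continuous responses, a covariance matrix common to all latent states (one possible constraint), no covariates, and missing values coded as None; all function and parameter names are hypothetical. Under these assumptions a missing component is integrated out of the Gaussian density, which amounts to keeping the sub-vector of the mean and the sub-block of the covariance for the observed components, and the log-likelihood follows from a scaled forward recursion.

```python
import math


def gaussian_logpdf_observed(y, mean, cov):
    """Log-density of the observed components of a 2-dimensional response.

    None marks a missing component. The missing part is marginalised out,
    so only the matching entries of `mean` and `cov` are used.
    """
    obs = [j for j, v in enumerate(y) if v is not None]
    if not obs:
        # Nothing observed at this occasion: the density factor is 1.
        return 0.0
    if len(obs) == 1:
        # Univariate marginal of the observed component.
        j = obs[0]
        var = cov[j][j]
        return -0.5 * (math.log(2 * math.pi * var) + (y[j] - mean[j]) ** 2 / var)
    # Both components observed: bivariate density with a closed-form inverse.
    d0, d1 = y[0] - mean[0], y[1] - mean[1]
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    quad = (cov[1][1] * d0 * d0
            - (cov[0][1] + cov[1][0]) * d0 * d1
            + cov[0][0] * d1 * d1) / det
    return -0.5 * (2 * math.log(2 * math.pi) + math.log(det) + quad)


def log_likelihood(seq, init, trans, means, cov):
    """Observed-data log-likelihood of one sequence via a scaled forward recursion.

    seq   : list of [y1, y2] responses, with None for missing values
    init  : initial probabilities of the latent states
    trans : transition matrix, trans[u][v] = P(state v at t | state u at t-1)
    means : one mean vector per latent state
    cov   : covariance matrix shared by all states (assumed constraint)
    """
    k = len(init)
    loglik = 0.0
    alpha = None
    for t, y in enumerate(seq):
        dens = [math.exp(gaussian_logpdf_observed(y, means[u], cov)) for u in range(k)]
        if t == 0:
            alpha = [init[u] * dens[u] for u in range(k)]
        else:
            alpha = [dens[v] * sum(alpha[u] * trans[u][v] for u in range(k))
                     for v in range(k)]
        # Rescale at every occasion to avoid numerical underflow.
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik


# Illustrative values only: two states, three occasions, two missing entries.
example = log_likelihood(
    seq=[[0.1, -0.2], [None, 1.3], [0.9, None]],
    init=[0.5, 0.5],
    trans=[[0.9, 0.1], [0.2, 0.8]],
    means=[[0.0, 0.0], [1.0, 1.0]],
    cov=[[1.0, 0.3], [0.3, 1.0]],
)
```

The same marginal densities would enter the E-step of an EM algorithm in place of the complete-data densities; the M-step and the computation of standard errors are not shown here.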

Pennoni, F., Bartolucci, F., Serafini, A., Pandolfi, S. (2019). Hidden Markov models for continuous multivariate data with missing responses. In Abstract Book 16th Conference of the International Federation of Classification Societies (pp.1-1).

Hidden Markov models for continuous multivariate data with missing responses

Pennoni, F.
2019

abstract + slide
hierarchical clustering; Expectation-Maximization algorithm; forward-backward recursions; multivariate Gaussian distribution
English
16th Conference of the International Federation of Classification Societies
2019
Theodore Chadjipadelis
Abstract Book 16th Conference of the International Federation of Classification Societies
Aug-2019
2019
1
1
https://ifcs.gr/
open
Files in this record:
Pennoni_IFCS1_2019.pdf
Access: open access
Attachment type: Publisher's Version (Version of Record, VoR)
Size: 73.98 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/243548