Gaussian mixture model with an extended ultrametric covariance structure

Giorgia Zaccaria
2022

Abstract

Gaussian Mixture Models (GMMs) are among the most widely used methods for model-based clustering. They assume a multivariate Gaussian distribution for each component of the mixture, centered at the mean vector and with volume, shape and orientation determined by the covariance matrix. To reduce the large number of parameters required by the covariance matrices, parsimonious parameterizations of the latter have been proposed in the literature, e.g., the eigen-decomposition and the parsimonious GMMs based on mixtures of probabilistic principal component analyzers and mixtures of factor analyzers. We introduce a new parameterization of a covariance matrix by defining an extended ultrametric covariance matrix, and we implement it in a GMM. This structure can describe multidimensional phenomena characterized by nested latent concepts at different levels of abstraction, from the most specific to the most general. The proposal identifies a hierarchical structure on the variables for each component of the GMM, and thus a different characterization of a multidimensional phenomenon for each component (cluster, subpopulation) of the mixture. At the same time, it defines a new parsimonious GMM, since the ultrametric covariance structure reconstructs the relationships among variables with a limited number of parameters. The proposal is applied to synthetic and real data. On the former, it shows good classification performance compared with the existing parameterizations; on the latter, it also provides insight into the hierarchical relationships among the variables for each cluster.
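
To make the parsimony argument concrete, the following minimal sketch counts the free covariance parameters of a GMM as the number of variables grows. It is an illustration only: the function name `gmm_covariance_params` is hypothetical, and the "diagonal" and "spherical" constraints are standard textbook restrictions used here for comparison. They are not the parameterizations named in the abstract, and the parameter count of the extended ultrametric structure is not reproduced here.

```python
def gmm_covariance_params(p: int, n_components: int, structure: str = "full") -> int:
    """Count free covariance parameters of a GMM with p variables.

    Hypothetical helper for illustration; covers only three textbook cases.
    """
    per_component = {
        # Symmetric p x p matrix: p variances plus p(p-1)/2 covariances.
        "full": p * (p + 1) // 2,
        # One variance per variable, no covariances.
        "diagonal": p,
        # A single variance shared by all variables.
        "spherical": 1,
    }[structure]
    # Each mixture component carries its own covariance matrix.
    return n_components * per_component


if __name__ == "__main__":
    # Unconstrained covariances grow quadratically in p, constrained ones do not.
    for p in (5, 10, 50):
        counts = {
            s: gmm_covariance_params(p, n_components=3, structure=s)
            for s in ("full", "diagonal", "spherical")
        }
        print(p, counts)
```

With three components, an unconstrained model on 50 variables already needs several thousand covariance parameters, which is the growth that parsimonious structures are designed to avoid.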
Type: Journal article - Scientific article
Keywords: Cluster analysis; Hierarchical models; Parsimonious models; Ultrametric matrices
Language: English
Date: 25 Feb 2022
Year: 2022
Volume: 16
Issue: 2
First page: 399
Last page: 427
Access: reserved
Cavicchia, C., Vichi, M., Zaccaria, G. (2022). Gaussian mixture model with an extended ultrametric covariance structure. Advances in Data Analysis and Classification, 16(2), 399-427 [10.1007/s11634-021-00488-x].
Files in this item:
File: Cavicchia-2022-Adv Data Anal Classif-VoR.pdf
Access: archive managers only
Description: Regular Article
Attachment type: Publisher's Version (Version of Record, VoR)
Size: 1.18 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/394314
Citations
  • Scopus: 4
  • ISI (Web of Science): 3