This paper introduces a significant extension of the flexible Dirichlet (FD) distribution, a tractable special mixture model for compositional data, i.e. data representing vectors of proportions of a whole. The FD model has several theoretical properties that make it suitable for inference and fairly easy to handle computationally. However, the rigid mixture structure implied by the FD makes it unsuitable for describing many compositional datasets. Furthermore, the FD only allows for negative correlations. The new extended model considerably relaxes the strict constraints among clusters entailed by the FD; it thereby allows for a more general dependence structure (including positive correlations) and greatly expands the model's range of applications. At the same time, it largely retains the good properties of the FD. EM-type estimation procedures, including reliable ad hoc initialization methods, can be developed for this more complex model, which keeps the computational burden manageable. Accurate standard error estimates can be provided as well.
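As a rough illustration of the abstract's remark that the FD only allows for negative correlations, the sketch below simulates from an FD distribution and inspects the sample correlation matrix. It assumes the usual mixture representation from the earlier FD literature, which this page does not state: component i is a Dirichlet with parameter vector alpha + tau * e_i and is chosen with probability p_i. The function name `sample_fd` and all parameter values are hypothetical, and the code covers only the original FD, not the extended model proposed in the paper.

```python
import numpy as np

def sample_fd(n, alpha, p, tau, rng):
    """Draw n compositions from a flexible Dirichlet, viewed as a finite
    mixture of Dirichlets (assumed parameterization, see lead-in)."""
    alpha = np.asarray(alpha, dtype=float)
    p = np.asarray(p, dtype=float)
    d = alpha.size
    # Mixture label for each draw: component i is picked with probability p[i].
    z = rng.choice(d, size=n, p=p)
    # Component i has Dirichlet parameters alpha + tau * e_i.
    params = np.tile(alpha, (n, 1))
    params[np.arange(n), z] += tau
    # A Dirichlet draw is a vector of independent gammas divided by its sum.
    g = rng.gamma(shape=params, scale=1.0)
    return g / g.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
x = sample_fd(20000, alpha=[2.0, 3.0, 4.0], p=[0.2, 0.3, 0.5], tau=5.0, rng=rng)
corr = np.corrcoef(x, rowvar=False)  # 3 x 3 sample correlation matrix
print(np.round(corr, 3))
```

Under this parameterization every off-diagonal entry of `corr` should come out negative, whatever values are chosen for `alpha`, `p` and `tau`; this is the limitation the extended model is meant to remove.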

Ongaro, A., Migliorati, S., Ascari, R. (2020). A new mixture model on the simplex. STATISTICS AND COMPUTING, 30(4), 749-770 [10.1007/s11222-019-09920-x].

A new mixture model on the simplex

Ongaro, Andrea (first author); Migliorati, Sonia (second author); Ascari, Roberto (last author)
2020

Type: Journal article - Scientific article
Keywords: Compositional invariance; Dirichlet mixture; EM-type algorithms; Multi-modality; Proportion
Language: English
Date: 10 January 2020
Year: 2020
Volume: 30
Issue: 4
First page: 749
Last page: 770
Access: reserved
Files in this item:
Ongaro-2020-StatComp-VoR.pdf

Archive managers only

Attachment type: Publisher’s Version (Version of Record, VoR)
License: All rights reserved
Size: 2.43 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/257218
Citations
  • Scopus 8
  • ISI (Web of Science) 5