The Dirichlet is the best-known distribution for compositional data, i.e. data representing vectors of proportions. The flexible Dirichlet distribution (FD) generalizes the Dirichlet while preserving its main mathematical and compositional properties. At the same time, it does not inherit the Dirichlet's lack of flexibility in modeling the dependence concepts appropriate for compositional data. The present paper introduces a new model obtained by extending the basis of positive random variables that generates the FD by normalization. Specifically, the new basis exhibits a more sophisticated mixture (latent) representation, which leads to a twofold result. On the one hand, normalization yields a more general distribution for compositional data, called EFD. In particular, the EFD allows for a significantly wider differentiation among the clusters defining its mixture representation. On the other hand, the generalized basis induces a tractable model for the dependence between composition and size: the conditional distribution of the composition given the size is still an EFD, with the size affecting it in a simple fashion through the cluster weights.
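
As a rough illustration of what generating a distribution "by normalization" means, the following Python sketch draws a basis of positive random variables and divides it by its total (the size). It relies on the standard fact that normalizing independent gamma variables with a common scale yields a Dirichlet composition. The FD-type basis shown here is an assumed form that the abstract does not spell out: one randomly selected component receives an extra gamma variable, so that the composition becomes a mixture of Dirichlets. All function names and parameter values (`sample_dirichlet_basis`, `sample_fd_type_basis`, `normalize`, `alpha`, `p`, `tau`) are hypothetical, and the EFD itself is not implemented.

```python
import numpy as np


def sample_dirichlet_basis(alpha, n, rng):
    """Draw n vectors of independent Gamma(alpha_i, 1) variables."""
    alpha = np.asarray(alpha, dtype=float)
    return rng.gamma(shape=alpha, scale=1.0, size=(n, alpha.size))


def sample_fd_type_basis(alpha, p, tau, n, rng):
    """Assumed FD-type basis (illustrative, not taken from the abstract).

    Each draw starts from independent Gamma(alpha_i, 1) variables; one
    component, chosen with probabilities p, then receives an additional
    independent Gamma(tau, 1) variable. The chosen index is the latent
    cluster label.
    """
    alpha = np.asarray(alpha, dtype=float)
    basis = rng.gamma(shape=alpha, scale=1.0, size=(n, alpha.size))
    extra = rng.gamma(shape=tau, scale=1.0, size=n)
    labels = rng.choice(alpha.size, size=n, p=p)
    basis[np.arange(n), labels] += extra
    return basis, labels


def normalize(basis):
    """Split a basis into its composition (rows sum to 1) and its size."""
    size = basis.sum(axis=1)
    return basis / size[:, None], size


# Hypothetical parameter values, chosen only for illustration.
rng = np.random.default_rng(0)
alpha = [1.0, 2.0, 3.0]

dirichlet_comp, dirichlet_size = normalize(sample_dirichlet_basis(alpha, 5000, rng))

fd_basis, cluster = sample_fd_type_basis(alpha, [0.2, 0.3, 0.5], 4.0, 5000, rng)
fd_comp, fd_size = normalize(fd_basis)
```

Under these assumptions, conditioning on the latent label `cluster == i` gives a Dirichlet composition with parameter `alpha` increased by `tau` in position `i`, which is the sense in which the clusters define a mixture representation.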

Ongaro, A., Migliorati, S. (2014). A Dirichlet mixture model for compositions allowing for dependence on the size. In M. Carpita, E. Brentari, E.M. Qannari (Eds.), Advances in Latent Variables Methods, Models and Applications. Springer.

A Dirichlet mixture model for compositions allowing for dependence on the size

Andrea Ongaro; Sonia Migliorati
2014

Type: Book chapter or essay
Keywords: Dirichlet mixture, clusters, compositional invariance, basis size
Language: English
Book title: Advances in Latent Variables Methods, Models and Applications
ISBN: 978-3-319-02966-5

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/55520