The Dirichlet is the best-known distribution for data representing vectors of proportions (i.e. compositions). The flexible Dirichlet distribution (FD; Ongaro and Migliorati, 2013) generalizes the Dirichlet, preserving many of its mathematical and compositional properties without inheriting its lack of flexibility in modeling the dependence structure of the data. The Extended FD model (EFD; Ongaro and Migliorati, 2014) is obtained by generalizing the basis of positive random variables that generates the FD by normalization. The new model exhibits a more sophisticated mixture representation and accommodates dependence between composition (the normalized basis) and size (the sum of the variables forming the basis). The present paper investigates the inferential aspects of the EFD as a model for compositional data. In particular, an analysis of the flexibility of the cluster structure and of the dependence pattern implied by the model reveals its relevance for applications. Furthermore, appropriate estimation procedures based on the E-M algorithm are devised. Specifically, an ad hoc initialization strategy is proposed to address the crucial choice of starting values for the E-M algorithm. Finally, the potential of the model is illustrated by means of applications to real data sets.
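
The "basis", "composition" and "size" mentioned in the abstract can be illustrated with a minimal sketch. It relies on the standard fact that normalizing independent Gamma variables with a common scale yields a Dirichlet composition. The optional extra Gamma mass added to one randomly chosen coordinate is an assumed form of the FD basis, included only for illustration; the function names, parameter names (`alpha`, `p`, `tau`) and numeric values are hypothetical and do not come from this record.

```python
import random


def sample_basis(alpha, p=None, tau=None, rng=random):
    """Draw one vector of positive random variables (the basis).

    Without p and tau: independent Gamma(alpha_i, scale=1) draws, whose
    normalization is Dirichlet(alpha).
    With p and tau: a Gamma(tau, scale=1) draw is added to one coordinate,
    picked with probabilities p. This is an ASSUMED form of the FD basis.
    """
    y = [rng.gammavariate(a, 1.0) for a in alpha]
    if p is not None and tau is not None:
        i = rng.choices(range(len(alpha)), weights=p, k=1)[0]
        y[i] += rng.gammavariate(tau, 1.0)
    return y


def normalize(y):
    """Split a basis into its composition (proportions) and its size (sum)."""
    size = sum(y)
    composition = [v / size for v in y]
    return composition, size


rng = random.Random(0)          # fixed seed, for reproducibility only
alpha = [2.0, 3.0, 5.0]         # hypothetical parameter values

# Dirichlet composition: normalized independent Gammas.
comp, size = normalize(sample_basis(alpha, rng=rng))

# Assumed FD-style composition: one coordinate receives extra Gamma mass.
fd_comp, fd_size = normalize(
    sample_basis(alpha, p=[0.2, 0.3, 0.5], tau=4.0, rng=rng)
)
```

Repeating the second draw many times would give a sample from a mixture with one component per coordinate, which is the kind of cluster structure the abstract refers to. How the EFD generalizes this basis is not specified in this record, so it is not sketched here.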

Migliorati, S., Ongaro, A. (2015). Inferential issues in the Extended Flexible Dirichlet model. In ASMDA 2015 Proceedings (pp. 665-678). Skiadas, CH. Published by ISAST: International Society for the Advancement of Science and Technology.

Inferential issues in the Extended Flexible Dirichlet model

MIGLIORATI, SONIA (first author); ONGARO, ANDREA (second author)
2015

Type: paper
Keywords: Dirichlet mixture, E-M algorithm, compositional data, clusters
Language: English
Conference: ASMDA (Applied Stochastic Models and Data Analysis International Conference), 30 June - 4 July 2015
Editor: Skiadas, CH
Proceedings: ASMDA 2015 Proceedings
ISBN: 978-618-5180-05-8
Year: 2015
Pages: 665-678
URL: www.asmda.es/asmda2015.html
Use this identifier to cite or link to this document: https://hdl.handle.net/10281/94894