A New Dirichlet-Multinomial Mixture Regression Model for the Analysis of Microbiome Data

Ascari, Roberto; Migliorati, Sonia; Ongaro, Andrea
2025

Abstract

Motivated by the challenges in analyzing gut microbiome and metagenomic data, this paper introduces a novel mixture distribution for multivariate counts and a regression model built upon it. The flexibility and interpretability of the proposed distribution accommodate both negative and positive dependence among taxa and are accompanied by numerous theoretical properties, including explicit expressions for inter- and intraclass correlations, thereby providing a powerful tool for understanding complex microbiome interactions. Furthermore, the regression model based on this distribution facilitates the clear identification and interpretation of relationships between taxa and covariates by modeling the marginal mean of the multivariate response (i.e., taxa counts). Inference is performed using a tailored Hamiltonian Monte Carlo estimation method combined with a spike-and-slab variable selection procedure. Extensive simulation studies and an application to a human gut microbiome dataset highlight the proposed model's substantial improvements over competing models in terms of fit, interpretability, and predictive performance.
Type: Journal article - Scientific article
Keywords: compound distribution; inter- and intraclass correlations; multivariate counts; spike and slab prior
Language: English
Publication date: 7 August 2025
Year: 2025
Volume: 44
Issue: 18-19 (August 2025)
Article number: e70220
Access: open
Ascari, R., Migliorati, S., Ongaro, A. (2025). A New Dirichlet-Multinomial Mixture Regression Model for the Analysis of Microbiome Data. Statistics in Medicine, 44(18-19), e70220. https://doi.org/10.1002/sim.70220
Files in this item:
File: Ascari et al-2025-Statistics in Medicine-VoR.pdf
Access: open access
Description: This is an open access article under the terms of the Creative Commons Attribution License
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Size: 552.46 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/563981