Adjusting for false discoveries in constraint-based differential metabolic flux analysis

Galuzzi, BG (first author); Milazzo, L; Damiani, C (last author)
2024

Abstract

A critical step in characterizing metabolic alterations in multifactorial diseases, and their heterogeneity across patients, is identifying reactions whose usage (flux) differs significantly between cohorts. Because metabolic fluxes cannot be determined directly, researchers typically rely on constraint-based metabolic network models customized on post-genomics datasets. Random sampling within the feasible region of these networks is increasingly used to compare them. Although many algorithms have been proposed and compared for sampling the feasible region efficiently and uniformly, their impact on the risk of false discoveries when comparing different samples has not yet been investigated, and no sampling strategy has so far been designed specifically to mitigate the problem. To assess the False Discovery Rate (FDR) precisely, in this work we compared different samples obtained from the very same metabolic model, so that any reaction flagged as different is by construction a false discovery. We compared the FDR obtained for different model scales, sample sizes, sampling-algorithm parameters, and strategies for filtering out non-significant variations. To compare the widely used hit-and-run strategy with the much less investigated corner-based strategy, we first assessed the intrinsic capability of current corner-based algorithms, and of a newly proposed one, to visit all vertices of a constraint-based region. We show that false discoveries can occur at high rates even for large samples of small-scale networks. However, we demonstrate that a statistical test based on the empirical null distribution of the Kullback–Leibler divergence can effectively correct for false discoveries. We also show that our proposed corner-based algorithm is more efficient than state-of-the-art alternatives and much less prone to false discoveries than hit-and-run strategies.
We report that the differences in the marginal distributions obtained with the two strategies are related to differences in sample standard deviation, as previously thought, but are not fully explained by them. Overall, our study provides insights into the impact of sampling strategies on FDR in metabolic network analysis and offers new guidelines for more robust and reproducible analyses.
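
The idea of testing against an empirical null distribution of the Kullback–Leibler divergence can be illustrated with a minimal Python sketch. This is not the authors' implementation: the histogram-based KL estimator, the pseudocount, the function names, and the synthetic Gaussian draws standing in for the sampled fluxes of a single reaction are all assumptions made for illustration.

```python
import numpy as np


def kl_divergence(p_sample, q_sample, bins=30, pseudocount=0.5):
    """Histogram estimate of KL(P || Q) for two 1-D samples on shared bins."""
    lo = min(p_sample.min(), q_sample.min())
    hi = max(p_sample.max(), q_sample.max())
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(p_sample, bins=edges)
    q, _ = np.histogram(q_sample, bins=edges)
    # The pseudocount (an assumption) avoids log(0) in empty bins.
    p = (p + pseudocount) / (p + pseudocount).sum()
    q = (q + pseudocount) / (q + pseudocount).sum()
    return float(np.sum(p * np.log(p / q)))


def null_kl_distribution(draw_sample, n_pairs=200):
    """KL values between pairs of independent samples of the SAME source."""
    return np.array(
        [kl_divergence(draw_sample(), draw_sample()) for _ in range(n_pairs)]
    )


def empirical_p_value(observed, null_values):
    """Fraction of null KL values at least as large as the observed one."""
    # The +1 correction keeps the p-value strictly positive.
    return (1 + np.sum(null_values >= observed)) / (1 + len(null_values))


rng = np.random.default_rng(0)


def draw_a():
    # Hypothetical stand-in for one reaction's sampled fluxes in model A.
    return rng.normal(0.0, 1.0, size=1000)


def draw_b():
    # Hypothetical stand-in for the same reaction in a different model B.
    return rng.normal(1.0, 1.0, size=1000)


# Null: how large can KL get when both samples come from the same model?
null = null_kl_distribution(draw_a)

p_same = empirical_p_value(kl_divergence(draw_a(), draw_a()), null)
p_diff = empirical_p_value(kl_divergence(draw_a(), draw_b()), null)
print(f"same model: p = {p_same:.3f}; different models: p = {p_diff:.3f}")
```

In this sketch a reaction is called different only when its observed divergence is extreme relative to the divergences that arise between repeated samples of one and the same model, which is the role the abstract assigns to the empirical null distribution.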
Journal article - Scientific article
Constraint-based modeling; Corner-based; False discovery rate; Flux sampling; Hit and run
English
23-Jan-2024
2024
150
February 2024
104597
open
Galuzzi, B., Milazzo, L., Damiani, C. (2024). Adjusting for false discoveries in constraint-based differential metabolic flux analysis. JOURNAL OF BIOMEDICAL INFORMATICS, 150(February 2024) [10.1016/j.jbi.2024.104597].
Files in this item:
Galuzzi-2024-J Biomed Informatics-VoR.pdf (open access)
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Size: 2.59 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/473924