Bicocca Open Archive



On the use of topological features of metabolic networks for the classification of cancer samples

Craighero, Francesco; Maspero, Davide; Angaroni, Fabrizio; Damiani, Chiara; Graudenzi, Alex; Antoniotti, Marco (co-last author)

2021

Abstract

Background: The increasing availability of omics data collected from patients affected by severe pathologies, such as cancer, is fostering the development of data science methods for their analysis. Introduction: The combination of data integration and machine learning approaches can provide powerful new instruments to tackle the complexity of cancer development and deliver effective diagnostic and prognostic strategies. Methods: We explore the possibility of exploiting the topological properties of sample-specific metabolic networks as features in a supervised classification task. Such networks are obtained by projecting transcriptomic data from RNA-seq experiments onto genome-wide metabolic models to define weighted networks modeling the overall metabolic activity of a given sample. Results: We show the classification results on a labeled breast cancer dataset from the TCGA database, including 210 samples (cancer vs. normal). In particular, we investigate how the performance is affected by a threshold-based pruning of the networks by comparing Artificial Neural Networks, Support Vector Machines (SVMs) and Random Forests. Interestingly, the best classification performance is achieved within a small threshold range for all methods, suggesting that it might represent an effective choice to recover useful information while filtering out noise from data. Overall, the best accuracy is achieved with SVMs, which perform similarly to classifiers that use gene expression profiles as features. Conclusion: These findings demonstrate that the topological properties of sample-specific metabolic networks are effective in classifying cancer and normal samples, suggesting that useful information can be extracted from a relatively limited number of features.
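The pruning-then-featurization idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy network, the edge weights, the rule "drop edges whose weight is below the threshold", and the four features computed here are all assumptions chosen for illustration, and the helper names `prune` and `topological_features` are hypothetical.

```python
from statistics import mean


def prune(edges, threshold):
    """Keep only edges whose weight is at least `threshold` (assumed pruning rule)."""
    return {pair: w for pair, w in edges.items() if w >= threshold}


def topological_features(edges, nodes):
    """Return a small, illustrative feature vector for an undirected weighted network.

    Assumes `nodes` is non-empty and every edge endpoint appears in `nodes`.
    """
    degree = {n: 0 for n in nodes}
    strength = {n: 0.0 for n in nodes}
    for (u, v), w in edges.items():
        degree[u] += 1
        degree[v] += 1
        strength[u] += w
        strength[v] += w
    n = len(nodes)
    max_edges = n * (n - 1) / 2
    return {
        "n_edges": len(edges),
        "density": len(edges) / max_edges if max_edges else 0.0,
        "mean_degree": mean(degree.values()),
        "mean_strength": mean(strength.values()),
    }


# Toy sample-specific network: nodes stand in for network elements and
# weights for expression-derived activity scores (all values invented).
nodes = ["a", "b", "c", "d"]
edges = {("a", "b"): 0.9, ("b", "c"): 0.2, ("c", "d"): 0.6, ("a", "d"): 0.05}

# Sweep a few thresholds to see how pruning changes the feature vector.
for threshold in (0.0, 0.1, 0.5):
    features = topological_features(prune(edges, threshold), nodes)
    print(threshold, features)
```

In a setting like the one the abstract describes, one such feature vector per sample and per threshold would be passed to a supervised classifier; the classifier step is omitted here.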


Subtype	Journal article - Scientific article
Keywords	Cancer sample classification; Machine learning; Metabolic networks; Network pruning; RNA-seq data; Topological properties
Content language	English
Ahead-of-print or first online publication date	28 Feb 2021
Publication date	2021
Journal	CURRENT GENOMICS
Volume	22
Issue	2
First page	88
Last page	97
Article DOI	https://dx.doi.org/10.2174/1389202922666210301084151
Full text	Reserved
Citation	Machicao, J., Craighero, F., Maspero, D., Angaroni, F., Damiani, C., Graudenzi, A., et al. (2021). On the use of topological features of metabolic networks for the classification of cancer samples. CURRENT GENOMICS, 22(2), 88-97 [10.2174/1389202922666210301084151].
Appears in types	01 - Journal article

Files in this item:

File	Access	Attachment type	Size	Format
Fabrizio.pdf	Archive managers only	Publisher's Version (Version of Record, VoR)	1.64 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/314097
