Background: Mass spectrometry is an important analytical tool for clinical proteomics. Primarily employed for biomarker discovery, it is increasingly used to develop methods that may help provide an unambiguous diagnosis of biological samples. In this context, we investigated the classification of phenotypes by applying support vector machines (SVM) to experimental data obtained by the MudPIT approach. In particular, we compared the performance of SVM using two independent collections of complex samples and different data types, such as mass spectra (m/z), peptides and proteins.

Results: Overall, protein and peptide data provided better discriminant information than experimental mass spectra (overall accuracy higher than 87% in both collections 1 and 2). These results indicate that sequencing of peptides and proteins reduces the experimental noise affecting the raw mass spectra and allows the extraction of more informative features for the effective classification of samples. In addition, the protein and peptide features selected by SVM showed an 80% match with the differentially expressed proteins identified by the MAProMa software.

Conclusions: These findings confirm the availability of most label-free quantitative methods based on processing of spectral count and SEQUEST-based SCORE values. They also stress the usefulness of MudPIT data for correctly grouping sample phenotypes by applying both supervised and unsupervised learning algorithms. This capacity permits the evaluation of actual samples and is a good starting point for translating proteomic methodology to clinical application.
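As a rough illustration of the kind of classification the abstract describes, the sketch below trains a linear SVM on a table of per-protein spectral counts. It is not the authors' pipeline: the sample values, the three-protein layout, the function names (train_linear_svm, predict) and the hyperparameters are all hypothetical, and the optimiser is a plain sub-gradient descent on the regularised hinge loss, chosen only to keep the example dependency-free.

```python
# Hypothetical toy data: each row is one sample, each column the spectral
# count of one made-up protein; labels are +1 (phenotype A) or -1 (phenotype B).
SAMPLES = [
    ([12.0, 1.0, 5.0], 1),
    ([10.0, 0.0, 6.0], 1),
    ([11.0, 2.0, 4.0], 1),
    ([2.0, 9.0, 5.0], -1),
    ([1.0, 11.0, 6.0], -1),
    ([3.0, 10.0, 4.0], -1),
]


def score(w, b, x):
    """Signed distance-like score of sample x under the linear model (w, b)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(data, lam=0.01, lr=0.001, epochs=200):
    """Fit a linear SVM by sub-gradient descent on lam/2*||w||^2 + hinge loss.

    lam, lr and epochs are illustrative defaults, not tuned values.
    """
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            margin = y * score(w, b, x)
            # The regularisation term shrinks the weights on every step.
            w = [wi - lr * lam * wi for wi in w]
            # The hinge term only contributes when the margin is violated.
            if margin < 1:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if score(w, b, x) >= 0 else -1


if __name__ == "__main__":
    w, b = train_linear_svm(SAMPLES)
    correct = sum(predict(w, b, x) == y for x, y in SAMPLES)
    print("training accuracy:", correct / len(SAMPLES))
    print("weights per protein:", w)
```

In a linear model of this kind, the features with the largest absolute weights are the ones that contribute most to separating the two groups, which is one simple way such a classifier can point to candidate discriminant proteins. A real evaluation would also need held-out samples or cross-validation rather than training accuracy alone.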

Silvestre, D., Zoppis, I., Brambilla, F., Bellettato, V., Mauri, G., Mauri, P. (2013). Availability of MudPIT data for classification of biological samples. JOURNAL OF CLINICAL BIOINFORMATICS, 3(1) [10.1186/2043-9113-3-1].

Availability of MudPIT data for classification of biological samples

ZOPPIS, ITALO FRANCESCO; MAURI, GIANCARLO
2013

Journal article - Scientific article
Sample Classification, MudPIT, Support Vector Machines, Clinical proteomics, Label-free quantification
English
2013
3
1
1
open
Files in this item:
File: 109 J Clinical Bioinformatics.pdf
Access: open access
Size: 1.34 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/42736
Citations
  • Scopus 5