Controlled biomolecular annotations are key concepts in computational genomics and proteomics, since they describe the functional features of genes and their products in a way that is both simple and computable. Despite their importance, many annotations are missing, and the available ones contain errors and inconsistencies; furthermore, discovering and validating new annotations is very time-consuming. For these reasons, many computer scientists have recently developed machine-learning algorithms that computationally predict new gene-function relationships. While several of these methods have been easily adapted to bioinformatics from other domains, their validation remains a challenging aspect of a computational pipeline. Here, we propose a validation procedure based on three sub-phases, which can assess the precision of any algorithm's predictions with a reliable degree of accuracy. We show validation results obtained for Gene Ontology annotations of Homo sapiens genes that demonstrate the effectiveness of our validation approach.
Chicco, D., Masseroli, M. (2016). Validation pipeline for computational prediction of genomics annotation. In C. Angelini, P. Rancoita, S. Rovetta (Eds.), Computational Intelligence Methods for Bioinformatics and Biostatistics. 12th International Meeting, CIBB 2015, Naples, Italy, September 10-12, 2015, Revised Selected Papers (pp. 233-244). Springer Verlag [10.1007/978-3-319-44332-4_18].
Validation pipeline for computational prediction of genomics annotation
Chicco, D. (first author)
2016
File | Size | Format | Description | Attachment type | License | Access
---|---|---|---|---|---|---
Chicco-2016-CIBB-VoR.pdf | 343.99 kB | Adobe PDF | Book contribution | Publisher’s Version (Version of Record, VoR) | All rights reserved | Repository managers only