Bonizzoni, P., De Felice, C., Pirola, Y., Rizzi, R., Zaccagnino, R., Zizza, R. (2025). Identification of Chimeric RNAs: A Novel Machine Learning Perspective. In Computational Advances in Bio and Medical Sciences: 12th International Conference, ICCABS 2023, Norman, OK, USA, December 11–13, 2023, Revised Selected Papers (pp. 14-26). Springer Science and Business Media Deutschland GmbH [10.1007/978-3-031-82768-6_2].
Identification of Chimeric RNAs: A Novel Machine Learning Perspective
Bonizzoni P.; Pirola Y.; Rizzi R.
2025
Abstract
Chimeric RNAs are transcripts generated by gene fusion and intergenic splicing events, and thus comprise nucleotide sequences from different genes. Recent studies have shown that some chimeric RNAs can play a role in cancer development, and so can be used as diagnostic biomarkers when they are specifically expressed in cancerous cells and tissues. Most gene fusion prediction tools rely on an initial alignment step. However, alignments can be biased, especially for chimeric reads, producing many false positives. Alignment-free methods for predicting fusion genes would therefore be helpful and may provide new insights into the phenomenon of genomic breakage in the cell. In this direction, machine learning techniques could pave the way for new solutions, given their success in predicting genomic regulatory elements and alternative junction events from the genomic context. To date, however, these techniques have played only a marginal supporting role; furthermore, manually curated data sets, which are crucial for model training, are often expensive, unreliable, or simply unavailable. Here we propose a novel ML-based method that learns to recognize the hidden patterns that allow us to identify chimeric RNAs deriving from oncogenic gene fusions. A preliminary comparison with another state-of-the-art method shows promising results.


