A global cross-discipline effort is ongoing to characterize the evolution of the SARS-CoV-2 virus and to generate reliable epidemiological models of its diffusion. To this end, phylogenomic approaches leverage accumulating genomic mutations to track the evolutionary history of the virus and benefit from the surge of sequences deposited in public databases. Yet, such methods typically rely on consensus sequences representing the dominant virus lineage, whereas a complex intra-host genomic composition is often observed within single hosts. Furthermore, most approaches might produce inaccurate results with noisy data and sampling limitations, as witnessed in most countries affected by the epidemic. We introduce VERSO (Viral Evolution ReconStructiOn), a new comprehensive framework for the characterization of viral evolution and transmission from sequencing data of viral genomes. Our probabilistic approach first delivers robust phylogenetic models from clonal variant profiles and then exploits variant frequency patterns to characterize and visualize the intra-host genomic diversity of samples, which may reveal previously undetected infection events. We show via extensive simulations that VERSO outperforms state-of-the-art tools for phylogenetic inference, even under noisy observations and sampling limitations. The application of our approach to 3960 SARS-CoV-2 samples from amplicon sequencing and to 2766 samples from RNA sequencing yields robust phylogenomic models, improving the current knowledge of SARS-CoV-2 evolution and spread. Importantly, by exploiting co-occurrence patterns of minor variants, VERSO allows us to reveal previously undetected infection paths, which are validated with contact tracing data.
Moreover, the in-depth analysis of the mutational landscape of SARS-CoV-2 confirms a statistically significant increase in genomic diversity over time and allows us to identify a number of variants that are transitioning from a minor to a clonal state in the population, as well as several homoplasies, some of which might indicate ongoing positive selection. Overall, the results show that the joint application of our framework and data-driven epidemiological models might improve currently available strategies for pathogen surveillance and analysis. VERSO is released as an open-source tool at https://github.com/BIMIB-DISCo/VERSO.

Ramazzotti, D., Angaroni, F., Maspero, D., Gambacorti-Passerini, C., Antoniotti, M., Graudenzi, A., et al. (2020). VERSO: a comprehensive framework for the inference of robust phylogenies and the quantification of intra-host genomic diversity of viral samples [Other] [10.1101/2020.04.22.044404].

VERSO: a comprehensive framework for the inference of robust phylogenies and the quantification of intra-host genomic diversity of viral samples

Ramazzotti, Daniele; Angaroni, Fabrizio; Maspero, Davide; Gambacorti-Passerini, Carlo; Antoniotti, Marco; Graudenzi, Alex; Piazza, Rocco
2020

Other
Preprint
Phylogeny; Viral Sampling; SARS-CoV-2
English

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/288883