SVDSS: structural variation discovery in hard-to-call genomic regions using sample-specific strings from accurate long reads

Luca Denti (co-first author); Paola Bonizzoni (co-last author)
2023

Abstract

Structural variants (SVs) account for a large amount of sequence variability across genomes and play an important role in human genomics and precision medicine. Despite intense efforts over the years, the discovery of SVs in individuals remains challenging due to the diploid and highly repetitive structure of the human genome and to the presence of SVs that vastly exceed sequencing read lengths. However, the recent introduction of low-error long-read sequencing technologies such as PacBio HiFi may finally enable these barriers to be overcome. Here we present SV discovery with sample-specific strings (SVDSS), a method for discovery of SVs from long-read sequencing technologies (for example, PacBio HiFi) that combines mapping-free, mapping-based and assembly-based methodologies for overall superior SV discovery performance. Our experiments on several human samples show that SVDSS outperforms state-of-the-art mapping-based methods for discovery of insertion and deletion SVs in PacBio HiFi reads and achieves notable improvements in calling SVs in repetitive regions of the genome.
Type: Journal article - Scientific article
Keywords: Bioinformatics, Sequence Analysis, Structural Variations, PacBio HiFi
Language: English
Date: 22-Dec-2022
Year: 2023
Volume: 20
Issue: 4
First page: 550
Last page: 558
Access: open
Denti, L., Khorsand, P., Bonizzoni, P., Hormozdiari, F., Chikhi, R. (2023). SVDSS: structural variation discovery in hard-to-call genomic regions using sample-specific strings from accurate long reads. NATURE METHODS, 20(4), 550-558 [10.1038/s41592-022-01674-1].
Files in this item:
Denti-2022-Nat Method-AAM.pdf

Open access

Description: Article
Attachment type: Author’s Accepted Manuscript, AAM (Post-print)
License: Other
Size: 2.42 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/400015