This paper proposes a new similarity/diversity measure for the analysis of sequential data, in which useful information can also be obtained from the ordering relationships between sequence elements. The methodology is applied to characterize DNA sequences by evaluating their similarity/diversity. The proposed distance (the weighted standardized Hasse distance) is evaluated between pairs of Hasse matrices derived from the classical partial ordering rules. It can be naturally standardized, which allows the distances to be interpreted as absolute values (e.g. percentages) and simple similarity and correlation indices to be derived. DNA sequences taken from the first exons of the beta-globin genes of eight different species are analyzed. A sensitivity analysis is also performed, showing that the measure is highly sensitive to small modifications of the DNA sequences. Finally, the results are compared with those reported in the literature.

Todeschini, R., Ballabio, D., Consonni, V., Mauri, A. (2008). A New Similarity/Diversity Measure for the Characterization of DNA Sequences. CROATICA CHEMICA ACTA, 81(4), 657-664.

A New Similarity/Diversity Measure for the Characterization of DNA Sequences

TODESCHINI, ROBERTO;BALLABIO, DAVIDE;CONSONNI, VIVIANA;MAURI, ANDREA
2008

Type: Journal article - Scientific article
Keywords: DNA, partial ordering, Hasse matrix, distances, similarity/diversity, rank correlation
Language: English
Year: 2008
Volume: 81
Issue: 4
Pages: 657-664
Access: open
Files in this item:
File: A_new_similarity.pdf
Access: open access
Attachment type: Author's Accepted Manuscript, AAM (Post-print)
Size: 249.08 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/4927