Todeschini, R., Ballabio, D., Consonni, V., Mauri, A. (2007). A new similarity/diversity measure for sequential data. MATCH, 57, 51-67.
A new similarity/diversity measure for sequential data
Todeschini, Roberto; Ballabio, Davide; Consonni, Viviana; Mauri, Andrea
2007
Abstract
The concept of similarity and its dual concept of diversity play a fundamental role in several QSAR strategies, chemometrics and library searching methods, and virtual screening, as well as in relatively new fields such as genomics and proteomics. In this paper, a new flexible similarity/diversity measure is proposed to deal with sequential data, taking into account both the differences in property values of the sequence elements and the ordering relationships among the sequence elements themselves. Data such as DNA sequences, mass and NMR spectra, and sequential molecular descriptors are all characterized by an ordering variable (the sequence) and by a property of the sequence elements. Some examples on artificial DNA sequences, mass spectra, molecular descriptors and proteomic maps are given.