In the last few decades, similarity measures have become increasingly important because comparing samples makes it possible to find clusters of similar samples, to generate priority lists, and, more generally, to discover patterns in data structures. In drug design, their relevance is already well established in the search for the most suitable alternative to a target drug. In the QSAR field, they are currently the key factor in the read-across strategy, along with the defined chemical space. Similarity indices for binary variables are usually called similarity coefficients; their first definitions date back to the end of the 19th century and were provided by scientists especially interested in taxonomic studies. To date, more than 50 different similarity coefficients have been found in the literature, each with its own mathematical properties and characteristics and each used in different scientific fields. In this paper, five new similarity coefficients for binary data are proposed and compared with some well-known similarity coefficients.
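As an illustration of what a similarity coefficient for binary data looks like in practice, the following minimal sketch computes three classical coefficients (Jaccard-Tanimoto, simple matching, and Sørensen-Dice) from the usual a, b, c, d counts of a pair of binary vectors. The abstract does not say which well-known coefficients the paper uses for comparison, and the five new coefficients are not reproduced here; the function names, the example vectors, and the convention of returning 0.0 for an empty denominator are all assumptions made for this example.

```python
from typing import Sequence, Tuple


def contingency_counts(x: Sequence[int], y: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (a, b, c, d) for two binary vectors of equal length.

    a: positions where both are 1      b: x is 1, y is 0
    c: x is 0, y is 1                  d: both are 0
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    a = sum(1 for i, j in zip(x, y) if i and j)
    b = sum(1 for i, j in zip(x, y) if i and not j)
    c = sum(1 for i, j in zip(x, y) if not i and j)
    d = sum(1 for i, j in zip(x, y) if not i and not j)
    return a, b, c, d


def jaccard_tanimoto(x: Sequence[int], y: Sequence[int]) -> float:
    """a / (a + b + c); ignores joint absences."""
    a, b, c, _ = contingency_counts(x, y)
    denom = a + b + c
    return a / denom if denom else 0.0  # assumed convention for all-zero pairs


def simple_matching(x: Sequence[int], y: Sequence[int]) -> float:
    """(a + d) / (a + b + c + d); counts joint absences as agreement."""
    a, b, c, d = contingency_counts(x, y)
    total = a + b + c + d
    return (a + d) / total if total else 0.0  # assumed convention for empty input


def sorensen_dice(x: Sequence[int], y: Sequence[int]) -> float:
    """2a / (2a + b + c); weights joint presences twice."""
    a, b, c, _ = contingency_counts(x, y)
    denom = 2 * a + b + c
    return 2 * a / denom if denom else 0.0  # assumed convention for all-zero pairs


if __name__ == "__main__":
    # Hypothetical binary fingerprints, chosen only for illustration.
    x = [1, 1, 0, 1, 0, 0]
    y = [1, 0, 0, 1, 1, 0]
    print(contingency_counts(x, y))
    print(jaccard_tanimoto(x, y), simple_matching(x, y), sorensen_dice(x, y))
```

The three coefficients differ mainly in how they treat joint absences (d) and how heavily they weight joint presences (a). This is one reason the same pair of samples can rank differently under different coefficients.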
Consonni, V., Todeschini, R. (2012). New similarity coefficients for binary data. MATCH, 68(2), 581-592.
New similarity coefficients for binary data
Consonni, Viviana; Todeschini, Roberto
2012