Chicco, D., Campagner, A., Spagnolo, A., Ciucci, D., Jurman, G. (2025). The Silhouette coefficient and the Davies-Bouldin index are more informative than Dunn index, Calinski-Harabasz index, Shannon entropy, and Gap statistic for unsupervised clustering internal evaluation of two convex clusters. PeerJ Computer Science, 11, e3309. https://doi.org/10.7717/peerj-cs.3309

The Silhouette coefficient and the Davies-Bouldin index are more informative than Dunn index, Calinski-Harabasz index, Shannon entropy, and Gap statistic for unsupervised clustering internal evaluation of two convex clusters

Chicco D. (first author); Campagner A.; Ciucci D.
2025

Abstract

Clustering is an area of unsupervised machine learning in which an algorithm groups similar points into clusters in a meaningful way, according to the algorithm's properties. When an external ground truth is available, researchers can assess the quality of the clustering results with external clustering assessment metrics. When no external gold standard is available, however, researchers need to use internal clustering assessment metrics, which produce an outcome by considering only the data points of the identified clusters. Although there is consensus on using the adjusted Rand index for external clustering assessment, there is no standard for internal metrics. We fill this gap by comparing the six internal clustering metrics most commonly used in bioinformatics and health informatics: Silhouette coefficient, Davies-Bouldin index, Dunn index, Calinski-Harabasz index, Shannon entropy, and Gap statistic. We first analyze their mathematical properties, and then test them on the results of k-means with k = 2 clusters on multiple convex-shaped artificial datasets and on five real-world open medical datasets of electronic health records. Our results show that the Silhouette coefficient and the Davies-Bouldin index are more informative and reliable than the other metrics analyzed when assessing convex-shaped and non-nested clusters in Euclidean space.
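To make the two metrics highlighted in the abstract concrete, the following minimal sketch computes the Silhouette coefficient and the Davies-Bouldin index from their standard textbook definitions. It is not the authors' code: the toy points, the two labelings, and the function names `silhouette_coefficient` and `davies_bouldin_index` are assumptions made purely for illustration, and the labels are hard-coded rather than produced by k-means.

```python
import math


def group_by_label(points, labels):
    """Return a dict mapping each cluster label to its list of points."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters


def silhouette_coefficient(points, labels):
    """Mean of s(i) = (b - a) / max(a, b) over all points (higher is better)."""
    clusters = group_by_label(points, labels)
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


def davies_bouldin_index(points, labels):
    """Mean over clusters of the worst (S_i + S_j) / d(c_i, c_j) (lower is better)."""
    clusters = group_by_label(points, labels)
    centroids = {
        label: tuple(sum(coord) / len(members) for coord in zip(*members))
        for label, members in clusters.items()
    }
    # S_i: mean distance of the cluster's points to its centroid
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }
    total = 0.0
    for i in clusters:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


# Hypothetical toy data: two compact, well-separated convex clusters.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
good_labels = [0, 0, 0, 0, 1, 1, 1, 1]  # matches the visible structure
bad_labels = [0, 1, 0, 1, 0, 1, 0, 1]   # mixes the two groups

for name, labels in [("good", good_labels), ("bad", bad_labels)]:
    sil = silhouette_coefficient(points, labels)
    dbi = davies_bouldin_index(points, labels)
    print(f"{name}: silhouette={sil:.3f}, davies_bouldin={dbi:.3f}")
```

On this toy example the labeling that respects the two groups should score a higher Silhouette coefficient and a lower Davies-Bouldin index than the mixed labeling. In practice, scikit-learn offers `silhouette_score` and `davies_bouldin_score` in `sklearn.metrics`, which would normally be preferred over a hand-written version.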
Journal article - Scientific article
Algorithms and Analysis of Algorithms; Artificial Intelligence; Calinski-Harabasz index; Clustering; Clustering internal metrics; Data Mining and Machine Learning; Data Science; Databases; Davies-Bouldin index; Dunn index; Gap statistic; Shannon entropy; Silhouette coefficient
English
21 Nov 2025
2025
11
e3309
open
Files in this item:
File: Chicco et al-2025-PeerJ Computer Science-VoR.pdf
Access: open access
Attachment type: Publisher's Version (Version of Record, VoR)
License: Creative Commons
Size: 3.15 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/582043