A simple hierarchical clustering algorithm called CLUBS (for CLustering Using Binary Splitting) is proposed. CLUBS is faster and more accurate than existing algorithms, including k-means and its recently proposed refinements. The algorithm consists of a divisive phase and an agglomerative phase; during these two phases, the samples are repartitioned using a least quadratic distance criterion possessing unique analytical properties that we exploit to achieve a very fast computation. CLUBS derives good clusters without requiring input from users, and it is robust and impervious to noise, while providing better speed and accuracy than methods, such as BIRCH, that are endowed with the same critical properties. © Springer-Verlag 2013.
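
The abstract does not spell out the splitting or merging rules, so the following is only an illustrative sketch of the general idea, not the CLUBS procedure from the paper. It assumes that the "least quadratic distance criterion" is the within-cluster sum of squared Euclidean distances to the centroid (SSE), and that a divisive step tries axis-parallel binary splits. Under those assumptions, SSE can be computed from a count, per-dimension sums, and a sum of squared norms, so each candidate split is evaluated in constant time per dimension from running totals. The function names `sse`, `best_binary_split`, and `merge_increase` are hypothetical.

```python
def sse(n, sums, sq_total):
    """SSE of a cluster from its count, per-dimension sums, and total of squared norms."""
    if n == 0:
        return 0.0
    return sq_total - sum(v * v for v in sums) / n


def best_binary_split(points):
    """Return (gain, dim, threshold) for the axis-parallel split that most
    reduces SSE, or None if no split separates the points.
    Assumption: axis-parallel splits; the paper's actual rule may differ."""
    n, d = len(points), len(points[0])
    total_s = [sum(p[j] for p in points) for j in range(d)]
    total_q = sum(x * x for p in points for x in p)
    parent = sse(n, total_s, total_q)
    best = None
    for j in range(d):
        order = sorted(points, key=lambda p: p[j])
        left_s, left_q = [0.0] * d, 0.0
        for i in range(n - 1):
            # Move point i to the left side by updating running totals.
            for k in range(d):
                left_s[k] += order[i][k]
            left_q += sum(x * x for x in order[i])
            if order[i][j] == order[i + 1][j]:
                continue  # equal coordinates cannot be separated on this axis
            right_s = [total_s[k] - left_s[k] for k in range(d)]
            cost = sse(i + 1, left_s, left_q) + sse(n - i - 1, right_s, total_q - left_q)
            gain = parent - cost
            if best is None or gain > best[0]:
                best = (gain, j, (order[i][j] + order[i + 1][j]) / 2)
    return best


def merge_increase(n1, s1, n2, s2):
    """SSE increase caused by merging two clusters (Ward's formula),
    a natural cost for an agglomerative step under the same assumption."""
    return n1 * n2 / (n1 + n2) * sum((a / n1 - b / n2) ** 2 for a, b in zip(s1, s2))
```

For two well-separated groups of points, `best_binary_split` picks the threshold between the groups, and `merge_increase` applied to the two resulting halves returns the same value as the split gain, which shows that the divisive and agglomerative steps are inverse operations with respect to SSE.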

Masciari, E., Mazzeo, G., Zaniolo, C. (2013). A new, fast and accurate algorithm for hierarchical clustering on Euclidean distances. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 111-122). Springer Berlin Heidelberg [10.1007/978-3-642-37456-2_10].

A new, fast and accurate algorithm for hierarchical clustering on Euclidean distances

MAZZEO, GIUSEPPE MASSIMILIANO (second author)
2013

Type: Book chapter or essay
Subjects: Computer Science (all); Theoretical Computer Science
Language: English
Series: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Year: 2013
ISBN: 9783642374555
Volume: 7819
Publisher: Springer Berlin Heidelberg
First page: 111
Last page: 122

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/75852
Citations
  • Scopus 11