Bicocca Open Archive

A hierarchical document clustering environment based on the induced bisecting k-means

Archetti, Francesco Antonio; Fersini, Elisabetta; Campanelli, Piero; Messina, Vincenzina

2006

Abstract

The steady growth of information on the WWW and in digital libraries, portals, databases and local intranets has given rise to several methods that help users with Information Retrieval, information organization and browsing. Clustering algorithms are of crucial importance when no labels are associated with textual information or documents. In the text mining domain, the aim of clustering algorithms is to group documents concerning the same topic into the same cluster, producing a flat or hierarchical structure of clusters. In this paper we present a Knowledge Discovery System for document processing and clustering. The clustering algorithm implemented in this system, called Induced Bisecting k-Means, outperforms the Standard Bisecting k-Means and is particularly suitable for online applications in which computational efficiency is crucial.

Contribution type: slide + paper
Keywords: hierarchical clustering, Induced Bisecting k-Means, document clustering
Content language: English
Conference name: Flexibility in Database Management and Querying
Conference year: 2006
Proceedings title: Flexible Query Answering Systems (eds. H. L. Larsen, G. Pasi, D. O. Arroyo, T. Andreasen, H. Christiansen)
Proceedings volume ISBN: 978-3-540-34638-8
Series: Lecture Notes in Computer Science
Publication date: 2006
Volume number: 4027
First page: 257
Last page: 269
DOI of the contribution: https://dx.doi.org/10.1007/11766254_22
Alternative URL: http://www.springerlink.com/content/f6w07qj750658r27/
Fulltext: none
Citation: Archetti, F., Fersini, E., Campanelli, P., Messina, V. (2006). A hierarchical document clustering environment based on the induced bisecting k-means. In Flexible Query Answering Systems (eds. H. L. Larsen, G. Pasi, D. O. Arroyo, T. Andreasen, H. Christiansen) (pp. 257-269) [10.1007/11766254_22].
Appears in types: 02 - Conference contribution

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/5512
