An algorithm for the automatic labeling of topics according to a hierarchy is presented. Its main ingredients are a set of similarity measures and a set of topic labeling rules. The labeling rules are specifically designed to find the labels that best agree between the given topic and the hierarchy. The hierarchy is obtained from the Google Directory service, extracted via a purpose-built software procedure, and expanded using the OpenOffice English Thesaurus. The performance of the proposed algorithm is investigated using a corpus of 33,801 documents and a dictionary of 111,795 words. The results are encouraging, and several particularly interesting and significant labeling cases emerged.

Stella, F., Ciucci, D., Calegari, S., Magatti, D. (2009). Automatic labeling of topics. In 9th International Conference on Intelligent Systems Design and Applications, ISDA 2009 (pp. 1227-1232). IEEE [10.1109/ISDA.2009.165].

Automatic labeling of topics

Stella, Fabio Antonio; Ciucci, Davide Elio; Calegari, Silvia; Magatti, Davide
2009

Type: paper
Keywords: Automatic Topic Labeling, Topics Tree, Latent Dirichlet Allocation
Language: English
Conference: 9th International Conference on Intelligent Systems Design and Applications, ISDA 2009
Year: 2009
ISBN: 9780769538723
Pages: 1227-1232
Article number: 5364126

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/8357
Citations
  • Scopus 70
  • Web of Science 37