It is well known that supervised text classification methods must learn from many labeled examples to achieve high accuracy. In real settings, however, sufficient labeled examples are not always available. In this paper we demonstrate that, when the number of labeled examples is low, high accuracy can be obtained by using structured features instead of a list of weighted words as observed features. The proposed feature vector is based on a hierarchical structure, named a mixed Graph of Terms, composed of a directed and an undirected sub-graph of words, which can be constructed automatically from a set of documents using a probabilistic topic model. © Springer-Verlag Berlin Heidelberg 2013.

Colace, F., De Santo, M., Greco, L., Napoletano, P. (2013). Learning to Classify Text Using a Few Labeled Examples. In Communications in Computer and Information Science (pp. 200-214). Springer Verlag [10.1007/978-3-642-37186-8_13].

Learning to Classify Text Using a Few Labeled Examples

NAPOLETANO, PAOLO
2013

Type: Book chapter or essay
Keywords: Probabilistic topic model; Term extraction; Text classification; Computer Science (all)
Language: English
Series: Communications in Computer and Information Science
Year: 2013
ISBN: 9783642371851
Volume: 348
Publisher: Springer Verlag
Pages: 200-214

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/56745