(2015). Utilizing Wikipedia for Text Mining Applications. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2015).

Utilizing Wikipedia for Text Mining Applications

QURESHI, MUHAMMAD ATIF
2015

Abstract

Text is a popular medium of communication, but because it is produced by humans it is inherently complex. Given the huge amount of textual data currently available, it is essential to be able to mine this data automatically. Recent text mining efforts make extensive use of knowledge bases, and this thesis pursues a similar effort. We use Wikipedia to solve complex text mining tasks, noting that current approaches do not make effective use of the category-article structure within Wikipedia. In particular, we address two problems: determining the various topical threads in a document, and contextualizing social media content to disambiguate its various aspects. Experimental evaluations demonstrate the superiority of our proposed methods compared with the state of the art.
PASI, GABRIELLA
O'RIORDAN, COLM
Wikipedia, text mining, semantic relatedness, category-article structure, text classification, perspective-aware search
INF/01 - INFORMATICA
English
28-Oct-2015
Scuola di dottorato di Scienze
INFORMATICA - 22R
27
2013/2014
National University of Ireland, Galway
open
Files in this item:
File: phd_unimib_761334.pdf
Access: open access
Description: Doctoral thesis
Attachment type: Doctoral thesis
Size: 2.51 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/91081