Qureshi, M. A. (2015). Utilizing Wikipedia for Text Mining Applications. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2015).
Utilizing Wikipedia for Text Mining Applications
QURESHI, MUHAMMAD ATIF
2015
Abstract
Text is a popular medium of communication, but because it is produced by humans it is complex in nature. Given the huge amount of textual data currently available, it is essential to be able to mine it automatically. Recent text mining efforts make extensive use of knowledge bases, and this thesis pursues a similar effort. We use Wikipedia to solve complex text mining tasks, whereas current approaches do not make effective use of the category-article structure within Wikipedia. In particular, we address two problems: determining the various topical threads in a document, and contextualizing social media content to disambiguate its various aspects. Experimental evaluations demonstrate the superiority of our proposed methods over state-of-the-art approaches.

File | Size | Format |
---|---|---|
phd_unimib_761334.pdf (open access; description: Doctoral thesis; attachment type: Doctoral thesis) | 2.51 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.