Exploiting Wikipedia for entity name disambiguation in tweets

Qureshi, Muhammad Atif (first author); Pasi, Gabriella (last author)
2014

Abstract

Social media repositories are a significant source of evidence for extracting information about the reputation of a particular entity (e.g., a politician, singer, or company). Reputation management experts need automated methods for mining social media repositories, Twitter in particular, to monitor an entity's reputation. A significant research challenge in this setting is disambiguating tweets with respect to entity names. To address it, this paper uses "context phrases" in a tweet, together with the Wikipedia disambiguated articles for a particular entity, as features in a random forest classifier. We also use a notion of "relatedness" between a tweet and an entity, computed from the Wikipedia category-article structure, which captures how much of the discussion in a tweet relates to the entity. Experimental evaluation shows a significant improvement over the baseline and performance comparable to other systems, a strong result given that we restrict ourselves to features extracted from Wikipedia. © Springer International Publishing Switzerland 2014.
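To make the idea of context-phrase features more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a "context phrase" can be approximated by a word n-gram and that each candidate sense of an entity name is represented by the text of a Wikipedia article. The function names (context_phrases, overlap_feature, feature_vector) and the two toy article snippets are hypothetical and chosen only for illustration. The resulting feature vector is the kind of input a random forest classifier could be trained on; the classifier itself and the category-based relatedness score are left out.

```python
import re


def context_phrases(text, max_n=2):
    """Return the set of lowercase word n-grams (n = 1..max_n) found in text.

    Assumption: n-grams stand in for the paper's "context phrases".
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    phrases = set()
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            phrases.add(" ".join(words[i:i + n]))
    return phrases


def overlap_feature(tweet, article_text):
    """Fraction of the tweet's phrases that also occur in the article text."""
    tweet_phrases = context_phrases(tweet)
    if not tweet_phrases:
        return 0.0
    article_phrases = context_phrases(article_text)
    return len(tweet_phrases & article_phrases) / len(tweet_phrases)


def feature_vector(tweet, target_article, other_sense_articles):
    """Build a small feature vector for one tweet (hypothetical layout).

    [overlap with the target entity's article,
     best overlap with any other sense of the same name,
     difference between the two]
    """
    target = overlap_feature(tweet, target_article)
    best_other = max(
        (overlap_feature(tweet, a) for a in other_sense_articles),
        default=0.0,
    )
    return [target, best_other, target - best_other]


# Toy stand-ins for Wikipedia article text (illustrative only).
APPLE_COMPANY = "Apple Inc. designs the iPhone, the iPad and the Mac computer."
APPLE_FRUIT = "An apple is an edible fruit produced by an apple tree."

tweet = "Just bought the new iPhone from Apple"
features = feature_vector(tweet, APPLE_COMPANY, [APPLE_FRUIT])
print(features)
```

In this toy case the tweet shares more phrases with the company article than with the fruit article, so the third feature is positive. A real system would compute such features over full Wikipedia articles and learn the decision boundary from labeled tweets.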
Type: paper
Subject: Computer Science (all); Theoretical Computer Science
Language: English
Conference: International Conference on Applications of Natural Language to Information Systems, NLDB, June 18-20, 2014
Series: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ISBN: 9783319079820
Year: 2014
Volume: 8455
Pages: 184-195
URL: http://www.springer.com/gp/book/9783319079820
Qureshi, M., O'Riordan, C., Pasi, G. (2014). Exploiting Wikipedia for entity name disambiguation in tweets. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 184-195). Springer Verlag [10.1007/978-3-319-07983-7_25].

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/58504
Citations
  • Scopus 4
  • Web of Science (ISI) n/a