Company name disambiguation in tweets: A two-step filtering approach

Qureshi, MA; Younus, A; Pasi, G
2015

Abstract

Twitter has become a gold mine for companies that use it as a marketing tool and care about their online reputation. A significant research challenge in this setting is disambiguating tweets with respect to company names. Deciding whether a particular tweet is relevant to a company is an important task that has not yet been solved satisfactorily; to address it, in this paper we propose a Wikipedia-based two-step filtering algorithm. Unlike most other methods, the proposed approach is fully automatic and does not rely on hand-coded rules. The first step is a precision-oriented pass that uses Wikipedia as an external knowledge source: it extracts pertinent terms and phrases from certain parts of company Wikipedia pages and uses them as weighted filters to identify tweets about a given company. The second pass expands the first to increase recall by including more terms from URLs in tweets, Twitter user profile information, and hashtags. The approach is evaluated on a CLEF lab dataset and shows good performance, especially for English tweets.
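
The two passes described in the abstract can be illustrated with a minimal Python sketch. This is one possible reading of the abstract, not the authors' implementation: the function names, the tweet dictionary keys (`text`, `hashtags`, `profile`, `url_text`), the example term weights, and the threshold of 1.0 are all assumptions made for illustration. In particular, the abstract does not say how terms are weighted or how the second pass combines its extra sources; here the second pass simply re-scores the tweet with the extra text added.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(tokens, weighted_terms):
    """Sum the weights of the filter terms that occur among the tokens."""
    present = set(tokens)
    return sum(w for term, w in weighted_terms.items() if term in present)


def two_step_filter(tweet, weighted_terms, threshold=1.0):
    """Label a tweet 'related' or 'unrelated' to a company.

    `tweet` is a dict with hypothetical keys: 'text', 'hashtags',
    'profile' and 'url_text' (text taken from pages the tweet links to).
    `weighted_terms` maps a term to a weight; in the paper such terms
    come from the company's Wikipedia page, here they are made up.
    """
    # Pass 1 (precision-oriented): look at the tweet text only.
    if score(tokenize(tweet["text"]), weighted_terms) >= threshold:
        return "related"

    # Pass 2 (recall-oriented): add hashtags, profile and URL text.
    extra = " ".join(
        [
            tweet["text"],
            " ".join(tweet.get("hashtags", [])),
            tweet.get("profile", ""),
            tweet.get("url_text", ""),
        ]
    )
    if score(tokenize(extra), weighted_terms) >= threshold:
        return "related"
    return "unrelated"


# Hypothetical weighted filter terms for the company "Apple".
apple_terms = {"iphone": 1.0, "mac": 0.8, "ios": 0.7, "cupertino": 0.6}

tweets = [
    {"text": "Just bought the new iPhone from Apple"},
    {"text": "apple is great today", "hashtags": ["#ios", "#cupertino"]},
    {"text": "apple pie recipe"},
]
labels = [two_step_filter(t, apple_terms) for t in tweets]
```

In this toy run the first tweet passes on its text alone, the second is only recovered once its hashtags are taken into account, and the third matches no filter term in either pass.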
slide + paper
Subject areas: Theoretical Computer Science; Computer Science (all)
Language: English
Conference: Asia Information Retrieval Societies Conference, AIRS 2015
Book title: Information Retrieval Technology
ISBN: 9783319289397
Year: 2015
Volume: 9460
Pages: 358-365
Qureshi, M., Younus, A., O’Riordan, C., Pasi, G. (2015). Company name disambiguation in tweets: A two-step filtering approach. In Information Retrieval Technology (pp.358-365). GEWERBESTRASSE 11, CHAM, CH-6330, SWITZERLAND : Springer Verlag [10.1007/978-3-319-28940-3_28].

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/188074