Bicocca Open Archive

Company name disambiguation in tweets: A two-step filtering approach

Qureshi, MA; Younus, A; O’Riordan, C; Pasi, G

2015

Abstract

Using Twitter as an effective marketing tool has become a gold mine for companies interested in their online reputation. A significant research challenge in this area is disambiguating tweets with respect to company names. Determining whether a particular tweet is relevant or irrelevant to a company is an important task that has not yet been satisfactorily solved; to address it, this paper proposes a Wikipedia-based two-step filtering algorithm. Unlike most other methods, the proposed approach is fully automatic and does not rely on hand-coded rules. The first step is a precision-oriented pass that uses Wikipedia as an external knowledge source: it extracts pertinent terms and phrases from certain parts of company Wikipedia pages and uses them as weighted filters to identify tweets about a given company. The second pass extends the first to increase recall by including more terms from URLs in tweets, Twitter user profile information, and hashtags. The approach is evaluated on a CLEF lab dataset and shows good performance, especially for English tweets.
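
The two-pass idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the term lists, weights, threshold, and the names `WIKI_TERMS`, `EXTRA_TERMS`, and `classify` are all hypothetical, and the sketch assumes that any text behind a tweet's URLs has already been fetched and is passed in as a plain string.

```python
import re

# Hypothetical weighted terms, standing in for terms and phrases that
# would be extracted from a company's Wikipedia page (pass 1 filters).
WIKI_TERMS = {"iphone": 2.0, "ipad": 2.0, "mac": 1.0, "cupertino": 1.5}

# Hypothetical extra terms used only by the recall-oriented second pass.
EXTRA_TERMS = {"ios": 1.0, "appstore": 1.0}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens.

    The leading '#' of a hashtag is dropped, so '#iOS' becomes 'ios'.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def score(tokens, weights):
    """Sum the weights of all tokens that appear in the weight table."""
    return sum(weights.get(token, 0.0) for token in tokens)


def classify(tweet, profile="", url_text="", threshold=1.5):
    """Label a tweet using a precision pass followed by a recall pass."""
    # Pass 1 (precision-oriented): only the tweet text is scored against
    # the Wikipedia-derived terms.
    tokens = tokenize(tweet)
    if score(tokens, WIKI_TERMS) >= threshold:
        return "relevant (pass 1)"

    # Pass 2 (recall-oriented): add tokens from the user profile and from
    # URL content, and widen the term table with the extra terms.
    expanded = tokens + tokenize(profile) + tokenize(url_text)
    if score(expanded, {**WIKI_TERMS, **EXTRA_TERMS}) >= threshold:
        return "relevant (pass 2)"

    return "irrelevant"
```

For example, `classify("Just bought the new iPhone from Apple")` is accepted by the first pass, while a tweet such as "Loving apple today #iOS" is only accepted by the second pass when the profile text supplies further matching terms.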

Contribution type: slide + paper
Keywords: Theoretical Computer Science; Computer Science (all)
Content language: English
Conference name: Asia Information Retrieval Societies Conference, AIRS 2015
Conference year: 2015
Proceedings title: Information Retrieval Technology
Proceedings volume ISBN: 9783319289397
Series: Lecture Notes in Computer Science
Publication date: 2015
Volume number: 9460
First page: 358
Last page: 365
DOI: https://dx.doi.org/10.1007/978-3-319-28940-3_28
Full text: none
Citation: Qureshi, M., Younus, A., O’Riordan, C., Pasi, G. (2015). Company name disambiguation in tweets: A two-step filtering approach. In Information Retrieval Technology (pp. 358-365). Cham, Switzerland: Springer Verlag [10.1007/978-3-319-28940-3_28].
Appears in types: 02 - Conference contribution

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/188074
