Bicocca Open Archive

Short-text domain specific key terms/phrases extraction using an n-gram model with wikipedia

Qureshi, Muhammad Atif; O'Riordan, C.; Pasi, Gabriella

2012

Abstract

Finding domain-specific key terms/phrases in a given set of documents is a challenging task. A domain may be defined as an area of interest over a collection of documents, one that may not be explicitly defined but is implicitly observable in those documents. When considering a collection of documents related to academic research, examples of key terms/phrases may be "Information Retrieval", "Marine Biology", etc. In this paper, a technique for extracting important key terms/phrases in a considered topical domain is proposed, using external evidence from the titles of Wikipedia articles and the Wikipedia category graph. We performed some experiments over a document collection of Web sites of different post-graduate schools. Our preliminary evaluations show promising results for the detection of domain-specific key terms/phrases from the given set of domain-focused Web pages.
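As a rough illustration of the idea summarized in the abstract, the sketch below generates word n-grams from a short text and keeps those that exactly match an article title. It is not the authors' implementation: the function names, the `TITLES` set (a hypothetical in-memory stand-in for Wikipedia article titles), the n-gram length limit, and the exact-match rule are all assumptions made for this example, and the category-graph step mentioned in the abstract is not covered.

```python
from typing import Iterable, List, Set


def word_ngrams(text: str, max_n: int = 3) -> List[str]:
    """Return all word n-grams of length 1..max_n, lowercased."""
    # Naive whitespace tokenization; surrounding punctuation is stripped.
    tokens = [t.strip('.,;:!?()"\'').lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def candidate_key_phrases(text: str, titles: Iterable[str], max_n: int = 3) -> List[str]:
    """Keep n-grams that exactly match a known title, longest first."""
    title_set: Set[str] = {t.lower() for t in titles}
    matches = {g for g in word_ngrams(text, max_n) if g in title_set}
    # Sort by descending word count, then alphabetically for stable output.
    return sorted(matches, key=lambda g: (-len(g.split()), g))


# Hypothetical stand-in for a set of Wikipedia article titles (assumption).
TITLES = ["Information retrieval", "Marine biology", "Biology"]

page = "Our school offers research degrees in Information Retrieval and Marine Biology."
print(candidate_key_phrases(page, TITLES))
```

In this toy setup the longer matches are listed first, so a phrase such as "marine biology" ranks ahead of its sub-phrase "biology". How the paper itself ranks or filters candidates is not stated in this record.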

Contribution type: poster
Keywords: community detection; n-gram model; open-domain knowledge; Wikipedia
Keywords: Text Analysis, Information Retrieval
Content language: English
Conference name: 21st ACM International Conference on Information and Knowledge Management, CIKM 2012
Conference year: 2012
Proceedings title: Proceedings of the 21st ACM international conference on Information and knowledge management
Proceedings ISBN: 9781450311564
Publication date: 2012
First page: 2515
Last page: 2518
DOI: https://dx.doi.org/10.1145/2396761.2398680
Full text: none
Citation: Qureshi, M., O'Riordan, C., Pasi, G. (2012). Short-text domain specific key terms/phrases extraction using an n-gram model with wikipedia. In Proceedings of the 21st ACM international conference on Information and knowledge management (pp. 2515-2518). ACM [10.1145/2396761.2398680].
Appears in types: 02 - Conference contribution

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/43415
