Bicocca Open Archive

Keyphrase extraction is a research area of NLP that aims to identify the words and expressions in a text that comprehensively represent its content. In this study, we introduce a new approach called KRAKEN (Keyphrase extRAction maKing use of EmbeddiNgs). Our method uses widely adopted NLP techniques to extract keyphrases from a text in an unsupervised manner, and we evaluate its results on well-known benchmark datasets from the literature. The main contribution of this work is a novel approach to keyphrase extraction. Both natural language text preprocessing techniques and distributional semantics techniques, such as word embeddings, are used to obtain a vector representation of the texts that preserves their semantic meaning. Through KRAKEN, we propose and design a new method that exploits word embeddings to identify keyphrases, considering the relationships among the words in the text. To evaluate KRAKEN, we employ benchmark datasets and compare our approach with state-of-the-art methods. Another contribution of this work is a metric for ranking the identified keyphrases, which considers the relatedness both of the words within each phrase and of all the phrases extracted from the same text.
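The abstract does not spell out how the ranking metric is computed, so the following is only a generic sketch of the idea it describes: scoring candidate phrases by the relatedness of the words inside each phrase and by the relatedness of each phrase to the other candidates. It is not the KRAKEN implementation. The toy vectors, the function names, and the choice to multiply the two scores are all assumptions made for illustration.

```python
import math

# Toy 3-dimensional "embeddings", invented for illustration only.
# A real system would load pretrained word vectors instead.
TOY_VECTORS = {
    "keyphrase": [0.9, 0.1, 0.0],
    "extraction": [0.8, 0.2, 0.1],
    "word": [0.7, 0.3, 0.0],
    "embeddings": [0.75, 0.25, 0.05],
    "coffee": [0.0, 0.1, 0.9],
    "break": [0.1, 0.0, 0.8],
}


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def phrase_vector(phrase):
    """Represent a phrase as the average of its word vectors."""
    vecs = [TOY_VECTORS[w] for w in phrase.split()]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def internal_relatedness(phrase):
    """Mean pairwise cosine similarity between the words of one phrase."""
    words = phrase.split()
    pairs = [(a, b) for i, a in enumerate(words) for b in words[i + 1:]]
    if not pairs:
        return 1.0  # a single word is trivially coherent with itself
    sims = [cosine(TOY_VECTORS[a], TOY_VECTORS[b]) for a, b in pairs]
    return sum(sims) / len(sims)


def external_relatedness(phrase, candidates):
    """Mean cosine similarity between a phrase and every other candidate."""
    others = [c for c in candidates if c != phrase]
    if not others:
        return 0.0
    pv = phrase_vector(phrase)
    return sum(cosine(pv, phrase_vector(c)) for c in others) / len(others)


def rank(candidates):
    """Sort candidates by an assumed combined score (product of both parts)."""
    scored = [
        (c, internal_relatedness(c) * external_relatedness(c, candidates))
        for c in candidates
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


candidates = ["keyphrase extraction", "word embeddings", "coffee break"]
ranking = rank(candidates)
for phrase, score in ranking:
    print(f"{score:.3f}  {phrase}")
```

With these toy vectors, the off-topic candidate "coffee break" is internally coherent but only weakly related to the other candidates, so it falls to the bottom of the ranking. Multiplying the two scores is just one possible way to combine them; a weighted sum would be an equally plausible choice.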

D'Amico, S., Malandri, L., Mercorio, F., Mezzanzanica, M. (2023). KRAKEN: A Novel Semantic-Based Approach for Keyphrases Extraction. In International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K - Proceedings (pp.289-297). Science and Technology Publications, Lda [10.5220/0012179500003598].

KRAKEN: A Novel Semantic-Based Approach for Keyphrases Extraction

D'Amico S.; Malandri L.; Mercorio F.; Mezzanzanica M.

2023

Contribution type: paper
Keywords: Keyphrases Benchmark Evaluation; Keyphrases Evaluation; Keyphrases Extraction; Natural Language Processing; Word Embeddings
Content language: English
Conference name: 15th International Conference on Knowledge Discovery and Information Retrieval, KDIR 2023, as part of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2023, 13 November 2023 through 15 November 2023
Conference year: 2023
Volume editors: Fred, A; Coenen, F; Bernardino, J
Proceedings title: International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K - Proceedings
Proceedings ISBN: 9789897586712
Series: IC3K
Publication date: 2023
Volume number: 1
First page: 289
Last page: 297
DOI: https://dx.doi.org/10.5220/0012179500003598
Full text: none
Appears in types: 02 - Conference contribution

Files in this item:

There are no files associated with this item.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/455025
