Mapping natural language terms to a Web knowledge base enriches information systems without additional context, with new relations and properties from the Linked Open Data. In this paper we formally define this task, which is related to word sense disambiguation, named entity recognition and ontology matching. We provide a manually annotated dataset of labels linked to DBpedia as a gold standard for evaluation, and we use it to experiment with a number of methods, including a novel algorithm that leverages the specific characteristics of the mapping task. The empirical evidence confirms that general term mapping is a hard task that cannot be easily solved by applying existing methods designed for related problems. However, incorporating NLP ideas such as representing the context and properly treating multiword expressions can significantly boost performance, in particular the coverage of the mapping. Our findings open up the challenge of finding new ways to approach term mapping to Web resources and to bridge the gap between natural language and the Semantic Web.
Basile, V., Cabrio, E., Gandon, F., Nozza, D. (2018). Mapping natural language labels to structured web resources. In Proceedings of the 2nd Workshop on Natural Language for Artificial Intelligence, NL4AI 2018 (pp. 63-75). CEUR-WS.
Mapping natural language labels to structured web resources
Nozza, D
2018
File: Basile-2018-NL4AI-VoR.pdf
Access: open access
Description: Conference contribution
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Size: 1 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.