Cremaschi, M., Saleri, S., Maurino, A. (2022). A geometrical deep learning model for the lexicalisation of 'unseen' RDF triples. In 2021 IEEE 23rd Int Conf on High Performance Computing & Communications; 7th Int Conf on Data Science & Systems; 19th Int Conf on Smart City; 7th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys) (pp.2233-2240). Institute of Electrical and Electronics Engineers Inc. [10.1109/HPCC-DSS-SmartCity-DependSys53884.2021.00334].

A geometrical deep learning model for the lexicalisation of 'unseen' RDF triples

Cremaschi M.; Maurino A.
2022

Abstract

A considerable amount of structured data is now available on the Web. To make the informational content of such data accessible and understandable to users, it is preferable to translate it into text. This task is known in the literature as 'data-to-text generation', and it is an instance of Natural Language Generation. To generate useful text from data, a process also known as lexicalisation, some approaches have begun to consider the Resource Description Framework (RDF) data contained in Knowledge Graphs. In this context, two main categories of neural lexicalisation approaches can be identified: pipeline and end-to-end. Pipeline systems perform better but are more complex to adapt. End-to-end systems have much simpler architectures but are less precise. To get the best of both categories, we propose a new hybrid approach, TripleEnc, which uses vector similarity between RDF triples to identify the best approach for lexicalisation. Empirical comparisons demonstrate that the novel approach improves the quality of the generated text.
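As an informal illustration of the idea of choosing a lexicalisation strategy by vector similarity between RDF triples, the following is a minimal sketch and not the TripleEnc implementation. Everything in it is an assumption made for the example: the bag-of-tokens embedding, the function names `embed`, `cosine` and `choose_approach`, the 0.5 threshold, and the routing direction (inputs similar to previously seen triples go to the end-to-end system, dissimilar ones to the pipeline), which the abstract does not specify.

```python
import math
from collections import Counter


def embed(triple):
    """Toy embedding (assumption): token counts over subject, predicate, object."""
    tokens = []
    for part in triple:
        tokens.extend(part.lower().replace("_", " ").split())
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def choose_approach(triple, seen_triples, threshold=0.5):
    """Return (approach, best_similarity) for one input triple.

    Hypothetical routing rule: if the triple is close enough to any
    previously seen triple, use the end-to-end system; otherwise fall
    back to the pipeline system.
    """
    query = embed(triple)
    best = max((cosine(query, embed(s)) for s in seen_triples), default=0.0)
    approach = "end-to-end" if best >= threshold else "pipeline"
    return approach, best


# Example usage with made-up triples.
seen = [
    ("Alan_Bean", "birthPlace", "Wheeler_Texas"),
    ("Alan_Bean", "occupation", "Test_pilot"),
]
print(choose_approach(("Alan_Bean", "birthPlace", "Wheeler_Texas"), seen))
print(choose_approach(("Paris", "capitalOf", "France"), seen))
```

A real system would presumably replace the toy token counts with learned triple encodings; the sketch only shows how a similarity score can drive the choice between the two families of systems.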
Type: paper
Keywords: Lexicalisation; Natural Language Generation; RDF triples
Language: English
Conference: 23rd IEEE International Conference on High Performance Computing and Communications, 7th IEEE International Conference on Data Science and Systems, 19th IEEE International Conference on Smart City and 7th IEEE International Conference on Dependability in Sensor, Cloud and Big Data Systems and Applications, HPCC-DSS-SmartCity-DependSys 2021 - 20-22 December 2021
Conference year: 2021
Published in: 2021 IEEE 23rd Int Conf on High Performance Computing & Communications; 7th Int Conf on Data Science & Systems; 19th Int Conf on Smart City; 7th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys)
ISBN: 9781665494571
Publication year: 2022
Pages: 2233-2240
reserved
Files in this item:
File: Cremaschi et al-2022-HPCC-DSS-SmartCity-DependSys-VoR.pdf
Access: repository managers only
Attachment type: Publisher's Version (Version of Record, VoR)
License: All rights reserved
Size: 169.84 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/559282