Cremaschi, M., De Paoli, F., Rula, A., Spahiu, B. (2020). A fully automated approach to a complete Semantic Table Interpretation. Future Generation Computer Systems, 112, 478-500 [10.1016/j.future.2020.05.019].

A fully automated approach to a complete Semantic Table Interpretation

Cremaschi M.; De Paoli F.; Rula A.; Spahiu B.
2020

Abstract

In recent years, there has been increasing interest in extracting and annotating tables on the Web. This activity transforms text data into machine-readable formats, enabling artificial intelligence tasks such as semantic search and dataset extension. Semantic Table Interpretation is the process of annotating elements in a table. Current approaches are mainly based on lexical matching algorithms that rely on metadata associated with tables or on custom Knowledge Graphs. Their main limitations are the lack of metadata, the limited use of contextual semantics, and incompleteness: the proposed methods do not include all the necessary steps. In this paper, we propose a comprehensive approach and a tool that provide an unsupervised method to annotate independent tables, possibly without a header row or other external information. The approach defines a context from the elements within the table in order to discriminate among matching entities found in shared Knowledge Graphs and create high-quality annotations. The approach achieved excellent results in an international challenge, which supports its effectiveness.
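
The idea of using table context to choose among lexically matching entities can be illustrated with a minimal sketch. This is not the authors' tool or algorithm: the toy Knowledge Graph, the `Entity` type, the `ex:` identifiers, and the scoring rule (counting terms shared between an entity's related terms and the other cells in the same row) are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set


@dataclass(frozen=True)
class Entity:
    """Hypothetical Knowledge Graph entry (not from the paper)."""
    uri: str
    label: str
    related: FrozenSet[str]  # lowercased terms linked to the entity


# Toy, in-memory stand-in for a shared Knowledge Graph (assumption).
TOY_KG: List[Entity] = [
    Entity("ex:Paris_France", "Paris", frozenset({"france", "europe"})),
    Entity("ex:Paris_Texas", "Paris", frozenset({"texas", "united states"})),
]


def build_context(table: List[List[str]], row: int, col: int) -> Set[str]:
    """Collect the other non-empty cells of the same row as context terms."""
    return {
        cell.strip().lower()
        for j, cell in enumerate(table[row])
        if j != col and cell.strip()
    }


def annotate_cell(table: List[List[str]], row: int, col: int) -> Optional[str]:
    """Return the URI of the best candidate for a cell, or None if no match."""
    mention = table[row][col].strip().lower()
    # Lexical matching step: candidates whose label equals the mention.
    candidates = [e for e in TOY_KG if e.label.lower() == mention]
    if not candidates:
        return None
    # Disambiguation step: prefer the candidate sharing most context terms.
    context = build_context(table, row, col)
    best = max(candidates, key=lambda e: len(e.related & context))
    return best.uri


# A table without a header row: the same mention resolves differently per row.
table = [["Paris", "France"], ["Paris", "Texas"]]
annotations = [annotate_cell(table, r, 0) for r in range(len(table))]
```

In this sketch the two "Paris" cells receive different annotations only because their row neighbours differ. A realistic system would query an actual Knowledge Graph and use a richer notion of context than exact term overlap.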
Journal article - Scientific article
Keywords: Knowledge Graph; Linked Data; Ontology; Semantic Table Interpretation; Semantic Web
Language: English
Date: 27 May 2020
Year: 2020
Volume: 112
First page: 478
Last page: 500
Access: partially open
Files in this item:

File: Cremaschi-2020-Future Generat Comput Systems-VoR.pdf
Access: Archive managers only
Description: Research Article
Attachment type: Publisher’s Version (Version of Record, VoR)
License: All rights reserved
Size: 2.36 MB
Format: Adobe PDF

File: Cremaschi-2020-Future Generat Comput Systems-AAM.pdf
Access: Open access
Description: Research Article
Attachment type: Author’s Accepted Manuscript, AAM (Post-print)
License: Creative Commons
Size: 918.9 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/283648
Citations
  • Scopus 38
  • ISI 13