Processing large-scale and highly interconnected Knowledge Graphs (KGs) is becoming crucial for many applications, such as recommender systems and question answering. Profiling approaches have been proposed to summarize large KGs, with the aim of producing concise and meaningful representations that can be easily managed. However, constructing profiles and calculating statistics such as cardinality descriptors or inferences is resource-intensive. In this paper, we present ABSTAT-HD, a highly distributed profiling tool that supports users in profiling and understanding big and complex knowledge graphs. We demonstrate the impact of the new architecture of ABSTAT-HD through a set of experiments that show its scalability with respect to three dimensions of the data to be processed: size, complexity, and workload. The experiments show that our profiling framework provides informative and concise profiles and can process and manage very large KGs.

Arturo Alva Principe, R., Maurino, A., Palmonari, M., Ciavotta, M., Spahiu, B. (2022). ABSTAT-HD: a scalable tool for profiling very large knowledge graphs. VLDB JOURNAL, 31(5), 851-875 [10.1007/s00778-021-00704-2].

ABSTAT-HD: a scalable tool for profiling very large knowledge graphs

Andrea Maurino (second author); Matteo Palmonari; Michele Ciavotta; Blerina Spahiu (last author)
2022

Journal article - Scientific article
Keywords: Data management; Data profiling; Data quality; Distributed processing engine; Knowledge graph
Language: English
Date: 29 September 2021
Year: 2022
Volume: 31
Issue: 5
First page: 851
Last page: 875
Access: reserved
Files in this item:
PVLDB_2021 (8).pdf

Archive managers only

Attachment type: Submitted Version (Pre-print)
Size: 1.25 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/327617
Citations
  • Scopus 5
  • ISI 4