This paper presents GraphDBLP, a system that models the DBLP bibliography as a graph database for performing graph-based queries and social network analyses. GraphDBLP also enriches the DBLP data through semantic keyword similarities computed via word-embedding. In this paper, we discuss how the system was formalized as a multi-graph, and how similarity relations were identified through word2vec. We also provide three meaningful queries for exploring the DBLP community to (i) investigate author profiles by analysing their publication records; (ii) identify the most prolific authors on a given topic, and (iii) perform social network analyses over the whole community. To date, GraphDBLP contains 5+ million nodes and 24+ million relationships, enabling users to explore the DBLP data by referencing more than 3.3 million publications, 1.7 million authors, and more than 5 thousand publication venues. Through the use of word-embedding, more than 7.5 thousand keywords and related similarity values were collected. GraphDBLP was implemented on top of the Neo4j graph database. The whole dataset and the source code are publicly available to foster the improvement of GraphDBLP in the whole computer science community.
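
The two ideas the abstract combines, bibliographic edges and keyword similarity, can be illustrated with a small in-memory sketch. This is not GraphDBLP's implementation, which runs on Neo4j: the edge lists `AUTHORED` and `CONTAINS`, the toy vectors in `VECTORS`, the function names, and the use of cosine similarity with a 0.8 threshold are all assumptions made for illustration, and the data is invented rather than taken from DBLP.

```python
import math
from collections import Counter, defaultdict

# Toy data, invented for illustration only (not taken from DBLP).
# Hypothetical (author, publication) edges.
AUTHORED = [("alice", "p1"), ("alice", "p2"), ("bob", "p2"), ("carol", "p3")]
# Hypothetical (publication, keyword) edges.
CONTAINS = [
    ("p1", "graph database"),
    ("p2", "graph database"),
    ("p2", "word embedding"),
    ("p3", "word embedding"),
]
# Hypothetical 2-d keyword vectors standing in for word2vec output.
VECTORS = {
    "graph database": [1.0, 0.0],
    "word embedding": [0.0, 1.0],
    "graph store": [0.9, 0.1],
}


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def similar_keywords(keyword, threshold=0.8):
    """Other keywords whose vector is at least `threshold` similar to `keyword`."""
    base = VECTORS[keyword]
    return {
        other
        for other, vec in VECTORS.items()
        if other != keyword and cosine(base, vec) >= threshold
    }


def prolific_authors(keyword, top=3):
    """Authors ranked by number of publications tagged with `keyword`."""
    pubs_by_keyword = defaultdict(set)
    for pub, kw in CONTAINS:
        pubs_by_keyword[kw].add(pub)
    pubs = pubs_by_keyword[keyword]
    counts = Counter(author for author, pub in AUTHORED if pub in pubs)
    return counts.most_common(top)
```

Calling `prolific_authors("graph database")` ranks the toy authors by publication count on that keyword, and `similar_keywords("graph database")` returns the keywords whose vectors point in nearly the same direction. In a graph database the same traversal would be expressed as a pattern-matching query rather than as Python loops.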

Mezzanzanica, M., Mercorio, F., Cesarini, M., Moscato, V., Picariello, A. (2018). GraphDBLP: a system for analysing networks of computer scientists through graph databases: GraphDBLP. MULTIMEDIA TOOLS AND APPLICATIONS, 77(14), 18657-18688 [10.1007/s11042-017-5503-2].

GraphDBLP: a system for analysing networks of computer scientists through graph databases: GraphDBLP

Mezzanzanica, M; Mercorio, F; Cesarini, M
2018

Journal article - Scientific article
Graph database; Knowledge extraction; Semantic analytics; Social network analysis; Word embedding;
Language: English
Year: 2018
Volume: 77
Issue: 14
Pages: 18657-18688
Access: reserved
Files in this record:
2018_Article_.pdf (Adobe PDF, 3.09 MB)
Attachment type: Publisher’s Version (Version of Record, VoR)
Access: archive managers only

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/176732
Citations
  • Scopus 23
  • Web of Science (ISI) 12