In this paper we introduce GraphDBLP, a tool that models the DBLP bibliography as a graph and enriches the DBLP data with semantic keyword similarities computed via word embeddings. GraphDBLP is implemented on top of the Neo4j graph database and can be queried through the Cypher query language. We also provide three meaningful queries for exploring the DBLP community to (i) investigate author profiles by analysing their publication records; (ii) identify the most prolific authors on a given topic; and (iii) perform social network analyses over the whole community. GraphDBLP is available on GitHub. To date, it contains 5+ million nodes and 24+ million relationships, enabling users to explore the DBLP data by referencing more than 3.3 million publications, 1.7 million authors and more than 5 thousand publication venues. Thanks to the use of word embeddings, more than 7.5 thousand keywords and related similarity values were collected.
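To make the idea behind query (ii) concrete, the following is a minimal in-memory sketch in Python, not GraphDBLP's actual Cypher query or schema. It assumes that keyword similarity is the cosine similarity between embedding vectors and that an author's productivity on a topic is the number of publications tagged with the keyword or a sufficiently similar one. The embedding values, publication identifiers, author names, the threshold, and the function names are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Toy keyword embeddings (invented values, for illustration only).
EMBEDDINGS = {
    "graph database": (0.9, 0.1, 0.0),
    "graph query": (0.8, 0.2, 0.1),
    "word embedding": (0.1, 0.9, 0.2),
}

# Toy bibliography: publication id -> (authors, keywords). All placeholders.
PUBLICATIONS = {
    "pub1": (["author_a", "author_b"], ["graph database"]),
    "pub2": (["author_a"], ["graph query"]),
    "pub3": (["author_c"], ["word embedding"]),
    "pub4": (["author_b"], ["word embedding"]),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similar_keywords(keyword, threshold=0.9):
    """Return the keyword itself plus every keyword at least `threshold` similar."""
    target = EMBEDDINGS[keyword]
    return {k for k, vec in EMBEDDINGS.items() if cosine(target, vec) >= threshold}


def prolific_authors(keyword, threshold=0.9):
    """Rank authors by number of publications on the keyword or similar keywords."""
    topics = similar_keywords(keyword, threshold)
    counts = Counter()
    for authors, keywords in PUBLICATIONS.values():
        if topics.intersection(keywords):
            counts.update(authors)
    return counts.most_common()


print(prolific_authors("graph database"))
```

In this toy data, "graph query" is close enough to "graph database" to be counted as the same topic, so publications tagged with either keyword contribute to an author's count. In a graph database the same traversal would be expressed as a pattern match from keyword nodes through publications to authors, followed by an aggregation.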
Cesarini, M., Mercorio, F., Mezzanzanica, M., Moscato, V., Picariello, A. (2018). GraphDBLP Released: Querying the Computer Scientists Network as a Graph. In Proceedings of the 26th Italian Symposium on Advanced Database Systems (SEBD 2018) (pp.1). CEUR-WS.
GraphDBLP Released: Querying the Computer Scientists Network as a Graph
Cesarini, M; Mercorio, F; Mezzanzanica, M
2018