TechMiner: Extracting technologies from academic publications

Osborne, F
2016

Abstract

In recent years we have seen the emergence of a variety of scholarly datasets. Typically these capture ‘standard’ scholarly entities and their connections, such as authors, affiliations, venues, publications, citations, and others. However, as the repositories grow and the technology improves, researchers are adding new entities to these repositories to develop a richer model of the scholarly domain. In this paper, we introduce TechMiner, a new approach, which combines NLP, machine learning and semantic technologies, for mining technologies from research publications and generating an OWL ontology describing their relationships with other research entities. The resulting knowledge base can support a number of tasks, such as: richer semantic search, which can exploit the technology dimension to support better retrieval of publications; richer expert search; monitoring the emergence and impact of new technologies, both within and across scientific fields; studying the scholarly dynamics associated with the emergence of new technologies; and others. TechMiner was evaluated on a manually annotated gold standard and the results indicate that it significantly outperforms alternative NLP approaches and that its semantic features improve performance significantly with respect to both recall and precision.
Yes
paper
Keywords: Bibliographic data; Data mining; Ontology learning; Scholarly data; Scholarly ontologies
Language: English
Conference: 20th International Conference on Knowledge Engineering and Knowledge Management, EKAW 2016, 19 November 2016 through 23 November 2016
ISBN: 978-3-319-49003-8
https://link.springer.com/chapter/10.1007/978-3-319-49004-5_30
Osborne, F., de Ribaupierre, H., Motta, E. (2016). TechMiner: Extracting technologies from academic publications. In 20th International Conference on Knowledge Engineering and Knowledge Management, EKAW 2016 (pp. 463-479). Springer Verlag [10.1007/978-3-319-49004-5_30].
Osborne, F; de Ribaupierre, H; Motta, E
Use this identifier to cite or link to this document: https://hdl.handle.net/10281/381543