In this paper a new graph-based model is proposed for the representation of textual documents. Graph-structures are obtained from textual documents by making use of the well-known Part-Of-Speech (POS) tagging technique. More specifically, a simple rule-based (re)classifier is used to map each tag onto graph vertices and edges. As a result, a decomposition of textual documents is obtained where tokens are automatically parsed and attached to either a vertex or an edge. It is shown how textual documents can be aggregated through their graph-structures and finally, it is shown how vertex-ranking methods can be used to find relevant tokens. © 2013. The authors - Published by Atlantis Press.
Bronselaer, A., Pasi, G. (2013). An approach to graph-based analysis of textual documents. In 8th Conference of the European Society for Fuzzy Logic and Technology, EUSFLAT 2013 - Advances in Intelligent Systems Research (pp. 634-641) [10.2991/eusflat.2013.96].
An approach to graph-based analysis of textual documents
PASI, GABRIELLA
2013
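The pipeline summarized in the abstract (POS tags reclassified into vertices and edges, documents aggregated through their graphs, vertices ranked to find relevant tokens) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the tag sets `VERTEX_TAGS` and `EDGE_TAGS` (Penn Treebank-style tags), the rule that links each noun to the next noun, aggregation as a plain union of edge lists, and PageRank as the vertex-ranking method. The input is assumed to be already POS-tagged.

```python
from collections import defaultdict

# Hypothetical reclassification rules (assumption, not the paper's rule set):
# nouns become vertices, verbs and prepositions become edge labels.
VERTEX_TAGS = {"NN", "NNS", "NNP", "NNPS"}
EDGE_TAGS = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ", "IN"}


def reclassify(tag):
    """Map a POS tag onto a graph role: 'vertex', 'edge', or 'skip'."""
    if tag in VERTEX_TAGS:
        return "vertex"
    if tag in EDGE_TAGS:
        return "edge"
    return "skip"


def build_graph(tagged_tokens):
    """Turn (token, tag) pairs into (source, label, target) edges.

    Each vertex token is linked to the next vertex token; the edge
    tokens seen in between form the label of that edge.
    """
    edges = []
    source, label = None, []
    for token, tag in tagged_tokens:
        role = reclassify(tag)
        if role == "vertex":
            if source is not None and label:
                edges.append((source, " ".join(label), token))
            source, label = token, []
        elif role == "edge":
            label.append(token)
    return edges


def rank_vertices(edges, damping=0.85, iterations=50):
    """Rank vertices with PageRank computed by power iteration."""
    nodes = sorted({v for s, _, t in edges for v in (s, t)})
    out = defaultdict(list)
    for s, _, t in edges:
        out[s].append(t)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by vertices without outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out[n])
        base = (1 - damping + damping * dangling) / len(nodes) if nodes else 0.0
        new_rank = {n: base for n in nodes}
        for s in nodes:
            for t in out[s]:
                new_rank[t] += damping * rank[s] / len(out[s])
        rank = new_rank
    return rank


# Two tiny, hand-tagged documents (made-up example data).
doc1 = [("the", "DT"), ("cat", "NN"), ("chases", "VBZ"), ("the", "DT"), ("mouse", "NN")]
doc2 = [("the", "DT"), ("dog", "NN"), ("chases", "VBZ"), ("the", "DT"), ("cat", "NN")]

# Aggregation as a union of edges: vertices sharing a token are merged.
aggregated = build_graph(doc1) + build_graph(doc2)
ranks = rank_vertices(aggregated)
top_token = max(ranks, key=ranks.get)
```

In this sketch the shared token `cat` merges the two document graphs into a single chain, which is the simplest possible form of aggregation; a real rule set would need to handle adjectives, pronouns, and multi-word expressions, which are skipped here.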