The progressive deployment of ICT in the courtroom is leading to the development of integrated multimedia folders in which the entire trial contents (documents, audio and video recordings) are available for online consultation via web-based platforms. The current amount of unstructured textual data available in the judicial domain, especially hearing transcriptions, therefore highlights the need to automatically extract structured data from unstructured data in order to improve the efficiency of consultation processes. In this paper we address the problem of extracting structured information from transcriptions generated automatically by an ASR (Automatic Speech Recognition) system, by integrating Conditional Random Fields with available background information. The computational experiments show promising results in structuring ASR outputs, enabling robust and efficient document consultation.
Fersini, E., Messina, V. (2013). Named Entities in Judicial Transcriptions: Extended Conditional Random Fields. In Proceedings of Computational Linguistics and Intelligent Text Processing. 14th International Conference, CICLing 2013, Samos, Greece, 24-30 March 2013 (pp. 317-328). Springer [10.1007/978-3-642-37247-6_26].
Named Entities in Judicial Transcriptions: Extended Conditional Random Fields
Fersini, Elisabetta; Messina, Vincenzina
2013