On the origin of long-range correlations in texts

G. Cristadoro
2012

Abstract

The complexity of human interactions with social and natural phenomena is mirrored in the way we describe our experiences through natural language. In order to retain and convey such high-dimensional information, the statistical properties of our linguistic output have to be highly correlated in time. One example is the robust, still largely unexplained observation of correlations on arbitrarily long scales in literary texts. In this paper we explain how long-range correlations flow from highly structured linguistic levels down to the building blocks of a text (words, letters, etc.). By combining calculations and data analysis, we show that correlations take the form of a bursty sequence of events once we approach the semantically relevant topics of the text. The mechanisms we identify are fairly general and can equally be applied to other hierarchical settings.
Type: Journal article - Scientific article
Keywords: Complex Systems; Language Dynamics; Long Correlations; Statistical Physics; Burstiness
Language: English
Year: 2012
Volume: 109
First page: 11582
Last page: 11587
Access: reserved
Altmann, E., Cristadoro, G., Degli Esposti, M. (2012). On the origin of long-range correlations in texts. Proceedings of the National Academy of Sciences of the United States of America, 109, 11582-11587 [10.1073/pnas.1117723109].
Files in this item:
File: 1207.0658.pdf
Access: Archive managers only
Description: post-print
Size: 2.31 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/185964
Citations
  • Scopus 77
  • ISI (Web of Science) 73