The Scolinter research project investigates the writing proficiency of primary school students in France, Italy, and Spain. The eponymous corpus is a large longitudinal collection of comparable children’s writing in these three languages (Ponton et al., 2021). The project aims to create an NLP tool tailored to this corpus to assist researchers in the linguistic description of phenomena relating to cohesion and textual coherence, in particular referential continuity. To this end, the first stage of the research consists in designing a model and choosing a suitable annotation format. This paper reviews the research on anaphora, coreference, and referential continuity, presents the specific features of the Scolinter corpus, and suggests methodological pathways for further work.

Barletta, M. (2024). Annotation de la continuité référentielle dans un corpus scolaire – premiers résultats. In 35èmes Journées d'Études sur la Parole (JEP), 31ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN), 26ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL) (pp. 28-41).

Annotation de la continuité référentielle dans un corpus scolaire – premiers résultats

Barletta, M.
2024

slide + paper
children’s corpora, NLP, referential continuity, annotated corpora
corpus scolaires, TAL, continuité référentielle, annotation de corpus
French
35èmes Journées d'Études sur la Parole (JEP) 31ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN) 26ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL)
2024
Balaguer, M.; Bendahman, N.; Ho-Dac, L.-M.; Mauclair, J.; Moreno, J. G.; Pinquier, J.
28
41
https://inria.hal.science/hal-04622985
open
Files in this record:
Barletta-2024-TALN JEP RECITAL-VoR.pdf (open access)
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Size: 363.95 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/520139