We present the SchoolDataIT R library, which provides an overview of the current status of the Italian educational system by gathering relevant open data on school infrastructure through web scraping and harmonising them into an organic database. In addition to infrastructural information, the software retrieves the results of the Invalsi census survey, which is typically considered a thorough indicator of education quality nationwide. The package is composed of four main groups of functions. The first group retrieves the inputs from the source web pages; the second is used for basic data editing; the third aggregates the data at a given territorial level, either municipalities (LAU) or provinces (NUTS-3); lastly, mapping functions render the final datasets as static or interactive maps. We show the potential application of the software through a practical example that highlights the importance of spatial statistics for modelling data about the educational system at the territorial level. Indeed, territorial disparities can be found across several dimensions of both infrastructure endowment and education quality, representing a significant challenge to territorial sustainability.

Cefalo, L., Maranzano, P. (2025). A comprehensive analysis of the Italian school system using harmonised open data via the SchoolDataIT R package. STATISTICAL METHODS & APPLICATIONS, 34(4), 815-839 [10.1007/s10260-025-00805-0].

A comprehensive analysis of the Italian school system using harmonised open data via the SchoolDataIT R package

Maranzano, Paolo
Second author
2025

Journal article - Scientific article
Italian school system; R programming language; School infrastructure; SchoolDataIT library; Web scraping;
English
7 Aug 2025
Year: 2025
Volume: 34
Issue: 4
Pages: 815-839
open
Files in this item:
File: Cefalo-Maranzano-2025- Stat Methods Appl-VoR.pdf
Access: open access
Description: This article is licensed under a Creative Commons Attribution 4.0 International License
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Size: 6.78 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/563661