Cefalo, L., Maranzano, P. (2025). A comprehensive analysis of the Italian school system using harmonised open data via the SchoolDataIT R package. STATISTICAL METHODS & APPLICATIONS, 34(4), 815-839 [10.1007/s10260-025-00805-0].
A comprehensive analysis of the Italian school system using harmonised open data via the SchoolDataIT R package
Maranzano, Paolo (second author)
2025
Abstract
We present the SchoolDataIT R library, which provides an overview of the current status of the Italian educational system by gathering relevant open data on school infrastructure through web scraping and harmonising them into a unified database. In addition to infrastructural information, the software retrieves the results of the Invalsi census survey, which is typically considered a thorough indicator of education quality nationwide. The package comprises four main groups of functions. The first group retrieves the inputs from the source web pages; the second performs basic data editing; the third aggregates the data at a given territorial level, either municipalities (LAU) or provinces (NUTS-3); lastly, mapping functions render the final datasets as static or interactive maps. We show a potential application of the software through a practical example that highlights the importance of spatial statistics in modelling data about the educational system at the territorial level. Indeed, territorial disparities can be found across several dimensions of both infrastructure endowment and education quality, representing a significant challenge to territorial sustainability.

| File | Access | Description | Attachment type | License | Size | Format |
|---|---|---|---|---|---|---|
| Cefalo-Maranzano-2025- Stat Methods Appl-VoR.pdf | Open access | This article is licensed under a Creative Commons Attribution 4.0 International License | Publisher's Version (Version of Record, VoR) | Creative Commons | 6.78 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


