This study presents a comparative analysis of three predictive models with an increasing degree of flexibility: hidden dynamic geostatistical models (HDGM), generalised additive mixed models (GAMM), and random forest spatiotemporal kriging models (RFSTK). These models are evaluated for their effectiveness in predicting PM2.5 concentrations in Lombardy (northern Italy) from 2016 to 2020. Despite differing methodologies, all models capture the spatiotemporal patterns in the air pollution data well and show similar out-of-sample performance. The study also carries out station-specific analyses, which reveal that model performance varies with local conditions. Model interpretation, based on parametric coefficient analysis and partial dependence plots, shows consistent associations between the predictor variables and PM2.5 concentrations. Although the models differ in how they represent spatiotemporal correlation, all of them effectively account for the underlying dependence. In summary, the study underscores the efficacy of conventional techniques in modelling correlated spatiotemporal data while highlighting the complementary potential of machine learning and classical statistical approaches.

Otto, P., Moro, A., Rodeschini, J., Shaboviq, Q., Ignaccolo, R., Golini, N., et al. (2024). Spatiotemporal modelling of PM2.5 concentrations in Lombardy (Italy): a comparative study. ENVIRONMENTAL AND ECOLOGICAL STATISTICS, 31(2), 245-272 [10.1007/s10651-023-00589-0].

Spatiotemporal modelling of PM2.5 concentrations in Lombardy (Italy): a comparative study

Maranzano, P;
2024

Type: Journal article - Scientific article
Keywords: Air pollution; Generalised additive mixed model; Geostatistics; Hidden dynamic geostatistical model; Machine learning; Random forest spatiotemporal kriging; Spatiotemporal process
Language: English
Date: 1 February 2024
Year: 2024
Volume: 31
Issue: 2
Pages: 245-272
Access: open
Files in this item:
File: Otto-2024-Environm Ecol Stat-VoR.pdf
Access: open access
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Size: 3.23 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/458518
Citations
  • Scopus 4
  • Web of Science (ISI) 2