Synthetic Data Generation (SDG) is expanding rapidly, yet existing surveys differ widely in scope and methodological quality. This tertiary study systematically searched four major scholarly databases (2015-2025) and, after PRISMA screening and DARE-4 appraisal, identified 17 eligible secondary studies. The evidence reveals a strong concentration in healthcare (58.8% of surveys), limited coverage of non-health domains, and inconsistent reporting of evaluation protocols (e.g., incomplete specification of metrics, data splits, baselines, or evaluation scripts). Fidelity and downstream utility dominate assessment practices, whereas privacy and diversity remain under-examined. Only 4 of 17 surveys provide any reproducibility artefacts. By consolidating these findings, we propose a compact, domain-agnostic evaluation baseline and highlight structural gaps in transparency, domain breadth, and methodological consistency. The study offers actionable guidance for strengthening reproducibility and broadening the evidential foundations of SDG research.
Nobani, N., Officioso, G., Pallucchini, F., Sperlì, G., Mercorio, F. (2026). Synthetic data generation: A tertiary study. INFORMATION PROCESSING & MANAGEMENT, 63(6 (September 2026)) [10.1016/j.ipm.2026.104715].
Synthetic data generation: A tertiary study
Nobani, N (first author); Officioso, G (second author); Pallucchini, F; Mercorio, F (last author)
2026
| File | Size | Format | |
|---|---|---|---|
| Nobani et al-2026-Information Processing & Management-VoR.pdf (open access). Attachment type: Publisher's Version (Version of Record, VoR). License: Creative Commons | 4.34 MB | Adobe PDF | View/Open |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


