Bernasconi, A., Pirola, F., Zanga, A., Balordi, A. (2026). When ‘Scarce’ Becomes ‘Too Scarce’: The Rare Diseases Drama. In Artificial Intelligence for Healthcare, and Hybrid Models for Coupling Deductive and Inductive Reasoning: First International Joint Conference, HC@AIxIA+HYDRA 2025, Bologna, Italy, October 25–26, 2025, Proceedings (pp. 126–131). Springer, Cham [10.1007/978-3-032-16708-8_11].
When ‘Scarce’ Becomes ‘Too Scarce’: The Rare Diseases Drama
Bernasconi, Alice; Pirola, Federico; Zanga, Alessio;
2026
Abstract
Rare diseases pose unique methodological challenges for causal questions, because data scarcity is not incidental but inherent. We present a case study on the external validation of a causal Bayesian network (BN) developed to investigate cardiovascular diseases (CVDs) in adolescent and young adult female breast cancer survivors. Despite strong internal performance, external validation across multiple Italian regions revealed scenarios in which the model’s predictions collapsed into complete uncertainty. These included: (i) shifts in missingness and selection mechanisms that made the original model parameters inappropriate; (ii) infrequent treatments that were not observed in the training data; and (iii) novel treatment combinations emerging from evolving clinical protocols. We discuss the implications of these findings, emphasizing the need for informative priors, timely model updates, and explicit uncertainty quantification. Our experience underscores that in rare disease research, robust causal modeling demands critical scrutiny, adaptive learning, and human revision, which remains indispensable. These lessons could apply broadly to other domains where data are scarce: methodological rigor and adaptability are as crucial as the models themselves.

| File | Access | Attachment type | License | Size | Format |
|---|---|---|---|---|---|
| Bernasconi et al-2026-HC@AIxIA+HYDRA-VoR.pdf | Archive managers only | Publisher’s Version (Version of Record, VoR) | All rights reserved | 942.31 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


