Objective: Women carrying BRCA mutations are advised to undergo bilateral risk-reducing salpingo-oophorectomy (RRSO) after childbearing, because no effective method exists for the early detection of ovarian cancer. Predictive machine learning (ML) techniques could therefore help clinicians identify high-risk BRCA-mutated patients and determine the appropriate timing for RRSO. Methods: We developed explainable ML models using clinical data from a multicentric cohort of 694 BRCA-mutated patients who underwent salpingo-oophorectomy at six Italian centers (Policlinico Gemelli, IRCCS San Gerardo, Policlinico Bari, Istituto Tumori Regina Elena, Istituto Tumori Giovanni Paolo II, Ospedale F. Miulli), of whom 39 (5.6%) were found to have a tumor. Data from Istituto Tumori Regina Elena and Policlinico Bari were used as the External Validation Cohort (EVC); data from the remaining centers formed the Investigational Cohort (IC). Resampling and ensemble techniques were used to handle class imbalance. Explainability techniques allowed us to identify protective and risk factors according to the models for the task under study. Results: On the EVC, the best ML model achieved an AUC of 79.3% (95% CI: 75.3% - 83.0%), an accuracy of 73.8% (95% CI: 69.6% - 78.2%), a sensitivity of 66.7% (95% CI: 58.1% - 75.3%), a specificity of 74.3% (95% CI: 68.7% - 80.0%), and a G-mean of 70.4% (95% CI: 63.0% - 76.0%). Although the model demonstrated good overall performance, its limited sensitivity reduces its effectiveness in this high-risk population. The variables CA125, age and MatoRRSO were the most significant risk factors, in agreement with the clinical perspective. Conversely, variables such as Estroprogestinuse and PregnancyNfdt played a protective role.
Conclusion: Our ML approach explores the intricate relationships among multiple clinical variables, with particular emphasis on their non-linear associations. However, although it provides valuable insights into risk assessment for BRCA-mutated patients, its current predictive capacity does not significantly improve upon existing clinical models.
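As a quick consistency check on the reported figures, the sketch below recomputes the G-mean from the sensitivity and specificity given in the abstract, along with the tumor prevalence in the full cohort. The abstract does not define G-mean, so the usual definition (the geometric mean of sensitivity and specificity) is assumed here; the function name `g_mean` is illustrative and is not taken from the study's code.

```python
import math


def g_mean(sensitivity: float, specificity: float) -> float:
    """Geometric mean of sensitivity and specificity (fractions in [0, 1]).

    Assumed definition: the abstract reports a G-mean but does not define it.
    """
    return math.sqrt(sensitivity * specificity)


# Point estimates reported in the abstract for the External Validation Cohort.
sensitivity = 0.667
specificity = 0.743

# Class imbalance in the full cohort: 39 tumors among 694 patients.
prevalence = 39 / 694

print(f"G-mean: {g_mean(sensitivity, specificity):.1%}")
print(f"Tumor prevalence: {prevalence:.1%}")
```

Under that assumed definition, the recomputed value agrees with the reported G-mean of 70.4%, and the prevalence agrees with the reported 5.6%.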
Loizzi, V., Comes, M., Arezzo, F., Apostol, A., Bove, S., Fanizzi, A., et al. (2025). Validation of machine learning-based models to predict and explain the risk of ovarian cancer: a multicentric study on BRCA-mutated patients undergoing risk-reducing salpingo-oophorectomy. Frontiers in Oncology, 15. doi: 10.3389/fonc.2025.1574037.
Validation of machine learning-based models to predict and explain the risk of ovarian cancer: a multicentric study on BRCA-mutated patients undergoing risk-reducing salpingo-oophorectomy
Fruscio R.; Negri S.
2025
| File | Description | Attachment type | License | Size | Format |
|---|---|---|---|---|---|
| Loizzi et al-2025-Frontiers in Oncology-VoR.pdf (open access) | This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). | Publisher's Version (Version of Record, VoR) | Creative Commons | 2.15 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


