High-level data fusion approaches, also known as consensus, combine the predictions of individual models to increase reliability and overcome the limitations of single models. Consensus strategies are frequently applied in Quantitative Structure-Activity Relationship (QSAR) modelling to reduce the uncertainty in predicted molecular activities and to improve the accuracy of model outcomes. However, specific regions of the chemical space may be systematically associated with low accuracy, and in these regions even consensus modelling, despite drawing on the multiple outcomes of individual models, cannot improve prediction reliability. This study proposes a new heuristic metric to assess the degree of accuracy of consensus predictions across the chemical space. The metric can assist in mapping prediction reliability and in delineating a safe zone where consensus predictions are expected to be more accurate. It is calculated with kernel-based potential functions and can be used for both classification and regression consensus modelling. The proposed approach was tested on four case studies, including extensive datasets for consensus modelling. The results showed that a potential can be associated with regions of the chemical space as a function of the accuracy of consensus modelling, and that this potential enables the mapping of prediction reliability and the definition of specific regions where predictions are expected to be more reliable.
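The abstract does not give the formula for the kernel-based potential, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a Gaussian kernel on Euclidean distances between molecular descriptors and a binary indicator of whether the consensus prediction was correct for each training molecule. The names `reliability_potential`, `bandwidth`, and `correct` are hypothetical and chosen for illustration.

```python
import numpy as np


def reliability_potential(X_train, correct, X_query, bandwidth=1.0):
    """Kernel-weighted average of consensus correctness around each query point.

    Illustrative assumption, not the published metric:
    X_train  : (n_train, n_features) descriptors of molecules with known outcome
    correct  : (n_train,) 1.0 where the consensus prediction was right, else 0.0
    X_query  : (n_query, n_features) points of the chemical space to be mapped
    bandwidth: width of the Gaussian kernel (hypothetical tuning parameter)
    """
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    correct = np.asarray(correct, dtype=float)

    # Squared Euclidean distances, shape (n_query, n_train).
    diff = X_query[:, None, :] - X_train[None, :, :]
    dist_sq = np.sum(diff ** 2, axis=2)

    # Gaussian kernel weights: nearby training molecules count more.
    weights = np.exp(-dist_sq / (2.0 * bandwidth ** 2))
    total = weights.sum(axis=1)

    # Where every weight underflows to zero, no potential can be assigned.
    safe_total = np.where(total > 0, total, 1.0)
    return np.where(total > 0, (weights @ correct) / safe_total, np.nan)


# Toy chemical space: one cluster where consensus was right, one where it was wrong.
X_train = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
           [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
correct = [1, 1, 1, 0, 0, 0]
X_query = [[0.05, 0.05], [5.05, 5.05]]

potential = reliability_potential(X_train, correct, X_query, bandwidth=0.5)
print(potential)
```

In this toy setting the first query point falls in the region where consensus predictions were accurate and receives a potential close to 1, while the second receives a potential close to 0. For regression, one plausible adaptation (again an assumption) is to replace `correct` with a score derived from the absolute prediction error, and a threshold on the potential could then be used to outline a safe zone.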
Consonni, V., Todeschini, R., Orlandi, M., Ballabio, D. (2024). Kernel-based mapping of reliability in predictions for consensus modelling. Chemometrics and Intelligent Laboratory Systems, 246 (15 March 2024). doi:10.1016/j.chemolab.2024.105085.
Kernel-based mapping of reliability in predictions for consensus modelling
Consonni, Viviana; Todeschini, Roberto; Orlandi, Marco; Ballabio, Davide
2024
File | Access | Attachment type | License | Size | Format
---|---|---|---|---|---
Consonni-2024-Chemometrics Intell Laborat Sys-VoR.pdf | Open access | Publisher’s Version (Version of Record, VoR) | Creative Commons | 8.56 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.