The need for explanations of ML systems is growing as new models outperform their predecessors while becoming more complex and less comprehensible for their end-users. Though several XAI methods have been proposed in recent years, not enough attention was paid to explaining how models change their behaviour in contrast with previous ones (e.g., due to retraining). In such cases, an XAI system should explain why the model changes its predictions concerning past outcomes. Capturing and understanding such differences is crucial, as the need for trust is key in any application to support human-AI decision-making processes. This is the idea of ContrXT, a novel approach that (i) traces the decision criteria of a black box text classifier by encoding the changes in the decision logic through Binary Decision Diagrams. Then (ii) it provides global, model-agnostic, Time-Contrastive (T-contrast) explanations in natural language, estimating why – and to what extent – the model has modified its behaviour over time. We implemented and evaluated ContrXT over several supervised ML models trained on a benchmark dataset and a real-life application, showing it is effective in catching majorly changed classes and in explaining their variation through a user study. The approach has been implemented, and it is available to the community both as a python package and through REST API, providing contrastive explanations as a service.

Malandri, L., Mercorio, F., Mezzanzanica, M., Nobani, N., & Seveso, A. (2022). ContrXT: Generating contrastive explanations from any text classifier. INFORMATION FUSION, 81(May 2022), 103-115 [10.1016/j.inffus.2021.11.016].

ContrXT: Generating contrastive explanations from any text classifier

Malandri, Lorenzo;Mercorio, Fabio
;
Mezzanzanica, Mario;Nobani, Navid;Seveso, Andrea
2022

Abstract

The need for explanations of ML systems is growing as new models outperform their predecessors while becoming more complex and less comprehensible for their end-users. Though several XAI methods have been proposed in recent years, not enough attention was paid to explaining how models change their behaviour in contrast with previous ones (e.g., due to retraining). In such cases, an XAI system should explain why the model changes its predictions concerning past outcomes. Capturing and understanding such differences is crucial, as the need for trust is key in any application to support human-AI decision-making processes. This is the idea of ContrXT, a novel approach that (i) traces the decision criteria of a black box text classifier by encoding the changes in the decision logic through Binary Decision Diagrams. Then (ii) it provides global, model-agnostic, Time-Contrastive (T-contrast) explanations in natural language, estimating why – and to what extent – the model has modified its behaviour over time. We implemented and evaluated ContrXT over several supervised ML models trained on a benchmark dataset and a real-life application, showing it is effective in catching majorly changed classes and in explaining their variation through a user study. The approach has been implemented, and it is available to the community both as a python package and through REST API, providing contrastive explanations as a service.
Articolo in rivista - Articolo scientifico
Contrastive explanation methods for XAI; Post-hoc explainability; XAI interpretability of text classifiers;
English
103
115
13
Malandri, L., Mercorio, F., Mezzanzanica, M., Nobani, N., & Seveso, A. (2022). ContrXT: Generating contrastive explanations from any text classifier. INFORMATION FUSION, 81(May 2022), 103-115 [10.1016/j.inffus.2021.11.016].
Malandri, L; Mercorio, F; Mezzanzanica, M; Nobani, N; Seveso, A
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10281/339363
Citazioni
  • Scopus 6
  • ???jsp.display-item.citation.isi??? 5
Social impact