Stable and actionable explanations of black-box models through factual and counterfactual rules

Cited: 15
Authors
Guidotti, Riccardo [1 ]
Monreale, Anna [1 ]
Ruggieri, Salvatore [1 ]
Naretto, Francesca [2 ]
Turini, Franco [1 ]
Pedreschi, Dino [1 ]
Giannotti, Fosca [2 ]
Affiliations
[1] Univ Pisa, Dept Comp Sci, Largo B Pontecorvo 3, I-56127 Pisa, PI, Italy
[2] Scuola Normale Super Pisa, Piazza Cavalieri 7, I-56126 Pisa, PI, Italy
Funding
UK Engineering and Physical Sciences Research Council (EPSRC); European Union Horizon 2020; European Research Council (ERC)
Keywords
Explainable AI; Local explanations; Model-agnostic explanations; Rule-based explanations; Counterfactuals; Instance selection; Algorithms
DOI
10.1007/s10618-022-00878-5
CLC classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Recent years have witnessed the rise of accurate but obscure classification models that hide the logic of their internal decision processes. Explaining the decision taken by a black-box classifier on a specific input instance is therefore of striking interest. We propose a local rule-based model-agnostic explanation method providing stable and actionable explanations. An explanation consists of a factual logic rule, stating the reasons for the black-box decision, and a set of actionable counterfactual logic rules, proactively suggesting the changes in the instance that lead to a different outcome. Explanations are computed from a decision tree that mimics the behavior of the black-box locally to the instance to explain. The decision tree is obtained through a bagging-like approach that favors stability and fidelity: first, an ensemble of decision trees is learned from neighborhoods of the instance under investigation; then, the ensemble is merged into a single decision tree. Neighbor instances are synthetically generated through a genetic algorithm whose fitness function is driven by the black-box behavior. Experiments show that the proposed method advances the state-of-the-art towards a comprehensive approach that successfully covers stability and actionability of factual and counterfactual explanations.
Pages: 2825-2862
Number of pages: 38
Related papers
50 records in total
  • [31] Regularizing Black-box Models for Improved Interpretability
    Plumb, Gregory
    Al-Shedivat, Maruan
    Cabrera, Angel Alexander
    Perer, Adam
    Xing, Eric
    Talwalkar, Ameet
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [32] Explaining Black-box Classification Models with Arguments
    Amgoud, Leila
    2021 IEEE 33RD INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2021), 2021, : 791 - 795
  • [33] Adversarial Eigen Attack on Black-Box Models
    Zhou, Linjun
    Cui, Peng
    Zhang, Xingxuan
    Jiang, Yinan
    Yang, Shiqiang
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 15233 - 15241
  • [34] Auditing black-box models for indirect influence
    Adler, Philip
    Falk, Casey
    Friedler, Sorelle A.
    Nix, Tionney
    Rybeck, Gabriel
    Scheidegger, Carlos
    Smith, Brandon
    Venkatasubramanian, Suresh
    KNOWLEDGE AND INFORMATION SYSTEMS, 2018, 54 : 95 - 122
  • [35] Black-box models for reference voltage monitoring
    Serbec, IN
    Fefer, D
    PROCEEDINGS OF THE IASTED INTERNATIONAL CONFERENCE ON APPLIED SIMULATION AND MODELLING, 2004, : 533 - 539
  • [36] Towards describing black-box testing methods as atomic rules
    Murnane, T
    Hall, R
    Reed, K
    PROCEEDINGS OF THE 29TH ANNUAL INTERNATIONAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE, 2005, : 437 - 442
  • [37] CombLM: Adapting Black-Box Language Models through Small Fine-Tuned Models
    Ormazabal, Aitor
    Artetxe, Mikel
    Agirre, Eneko
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 2961 - 2974
  • [38] Making Sense of Dependence: Efficient Black-box Explanations Using Dependence Measure
    Novello, Paul
    Fel, Thomas
    Vigouroux, David
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [39] Extracting Explanations, Justification, and Uncertainty from Black-Box Deep Neural Networks
    Ardis, Paul
    Flenner, Arjuna
    ASSURANCE AND SECURITY FOR AI-ENABLED SYSTEMS, 2024, 13054
  • [40] Can Explanations Be Useful for Calibrating Black Box Models?
    Ye, Xi
    Durrett, Greg
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 6199 - 6212