Stable and actionable explanations of black-box models through factual and counterfactual rules

Cited by: 15

Authors:

Guidotti, Riccardo [1]

Monreale, Anna [1]

Ruggieri, Salvatore [1]

Naretto, Francesca [2]

Turini, Franco [1]

Pedreschi, Dino [1]

Giannotti, Fosca [2]

Affiliations:

[1] Univ Pisa, Dept Comp Sci, Largo B Pontecorvo 3, I-56127 Pisa, PI, Italy

[2] Scuola Normale Super Pisa, Piazza Cavalieri 7, I-56126 Pisa, PI, Italy

Source:

DATA MINING AND KNOWLEDGE DISCOVERY | 2024 / Vol. 38 / Issue 05

Funding:

UK Engineering and Physical Sciences Research Council (EPSRC); EU Horizon 2020; European Research Council (ERC);

Keywords:

Explainable AI; Local explanations; Model-agnostic explanations; Rule-based explanations; Counterfactuals; INSTANCE SELECTION; ALGORITHMS;

DOI:

10.1007/s10618-022-00878-5

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recent years have witnessed the rise of accurate but obscure classification models that hide the logic of their internal decision processes. Explaining the decision taken by a black-box classifier on a specific input instance is therefore of striking interest. We propose a local rule-based model-agnostic explanation method providing stable and actionable explanations. An explanation consists of a factual logic rule, stating the reasons for the black-box decision, and a set of actionable counterfactual logic rules, proactively suggesting the changes in the instance that lead to a different outcome. Explanations are computed from a decision tree that mimics the behavior of the black-box locally to the instance to explain. The decision tree is obtained through a bagging-like approach that favors stability and fidelity: first, an ensemble of decision trees is learned from neighborhoods of the instance under investigation; then, the ensemble is merged into a single decision tree. Neighbor instances are synthetically generated through a genetic algorithm whose fitness function is driven by the black-box behavior. Experiments show that the proposed method advances the state-of-the-art towards a comprehensive approach that successfully covers stability and actionability of factual and counterfactual explanations.
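As an illustration only, and not the authors' implementation, the sketch below shows how a factual rule and counterfactual rules can be read off a local surrogate decision tree. The tree, the feature names (`income`, `debt`) and the labels are hypothetical and hand-written; in the method described above the tree would instead be obtained by merging an ensemble learned on genetically generated neighborhoods. Selecting counterfactuals as the differently labelled paths with the fewest unsatisfied split conditions is an assumption of this sketch, and actionability constraints are not modelled.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str


@dataclass
class Node:
    feature: str
    threshold: float
    left: "Tree"   # branch taken when x[feature] <= threshold
    right: "Tree"  # branch taken when x[feature] > threshold


Tree = Union[Leaf, Node]


def paths(tree, prefix=()):
    """Yield (conditions, label) for every root-to-leaf path."""
    if isinstance(tree, Leaf):
        yield list(prefix), tree.label
        return
    yield from paths(tree.left, prefix + ((tree.feature, "<=", tree.threshold),))
    yield from paths(tree.right, prefix + ((tree.feature, ">", tree.threshold),))


def holds(cond, x):
    """Check whether instance x satisfies a single split condition."""
    feature, op, threshold = cond
    return x[feature] <= threshold if op == "<=" else x[feature] > threshold


def explain(tree, x):
    """Return the factual rule for x and the closest counterfactual rules."""
    all_paths = list(paths(tree))
    # Factual rule: the unique path whose conditions x satisfies entirely.
    factual = next((c, l) for c, l in all_paths if all(holds(k, x) for k in c))
    others = [(c, l) for c, l in all_paths if l != factual[1]]
    if not others:
        return factual, []

    def violated(conds):
        return [k for k in conds if not holds(k, x)]

    # Counterfactuals: differently labelled paths with the fewest violated
    # conditions; only the conditions x would have to change are reported.
    fewest = min(len(violated(c)) for c, _ in others)
    counterfactuals = [
        (violated(c), l) for c, l in others if len(violated(c)) == fewest
    ]
    return factual, counterfactuals


# Hypothetical surrogate tree and instance, for illustration only.
toy_tree = Node(
    "income", 30.0,
    left=Node("debt", 10.0, left=Leaf("grant"), right=Leaf("deny")),
    right=Leaf("grant"),
)
applicant = {"income": 25.0, "debt": 15.0}

factual, counterfactuals = explain(toy_tree, applicant)
print("factual:", factual)
print("counterfactuals:", counterfactuals)
```

For this toy instance the factual rule is the path `income <= 30` and `debt > 10` leading to `deny`, and each counterfactual lists the single condition that would have to hold for the outcome to become `grant`.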

Pages: 2825-2862

Number of pages: 38
