How to Open a Black Box Classifier for Tabular Data

Cited by: 6

Authors:

Walters, Bradley [1]

Ortega-Martorell, Sandra [1]

Olier, Ivan [1]

Lisboa, Paulo J. G. [1]

Affiliations:

[1] Liverpool John Moores Univ, Sch Comp Sci & Math, Liverpool L3 2AF, England

Source:

ALGORITHMS | 2023 / Vol. 16 / No. 04

Keywords:

ANOVA; Shapley values; self-explaining neural networks; generalised additive models; interpretability;

DOI:

10.3390/a16040181

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A lack of transparency in machine learning models can limit their application. We show that analysis of variance (ANOVA) methods extract interpretable predictive models from them. This is possible because ANOVA decompositions represent multivariate functions as sums of functions of fewer variables. Retaining the terms in the ANOVA summation involving functions of only one or two variables provides an efficient method to open black box classifiers. The proposed method builds generalised additive models (GAMs) by application of L1 regularised logistic regression to the component terms retained from the ANOVA decomposition of the logit function. The resulting GAMs are derived using two alternative measures, Dirac and Lebesgue. Both measures produce functions that are smooth and consistent. The term partial responses in structured models (PRiSM) describes the family of models that are derived from black box classifiers by application of ANOVA decompositions. We demonstrate their interpretability and performance for the multilayer perceptron, support vector machines and gradient-boosting machines applied to synthetic data and several real-world data sets, namely Pima Diabetes, German Credit Card, and Statlog Shuttle from the UCI repository. The GAMs are shown to be compliant with the basic principles of a formal framework for interpretability.
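The decomposition step described in the abstract can be illustrated with a minimal sketch. It assumes the Dirac-measure variant amounts to an anchored ANOVA decomposition, in which the black-box logit is evaluated with all but one or two inputs fixed at a reference point; the anchor, the function names (`anchored_anova_terms`, `truncated_logit`) and the toy logit are hypothetical and not taken from the paper. The L1 regularised logistic regression that selects among the retained terms, and the Lebesgue-measure variant, are not shown.

```python
from itertools import combinations


def anchored_anova_terms(logit, anchor):
    """Return (f0, univariate, bivariate) component functions of `logit`.

    Assumption: a Dirac measure centred on `anchor`, so each term is
    obtained by evaluating the black box with the other inputs held
    at the anchor values.
    """
    d = len(anchor)
    f0 = logit(list(anchor))  # constant term: logit at the anchor

    def at(overrides):
        # Evaluate the logit at the anchor with some coordinates replaced.
        x = list(anchor)
        for idx, value in overrides.items():
            x[idx] = value
        return logit(x)

    def make_uni(i):
        # f_i(x_i) = f(x_i, anchor elsewhere) - f0
        return lambda xi: at({i: xi}) - f0

    uni = {i: make_uni(i) for i in range(d)}

    def make_bi(i, j):
        # f_ij(x_i, x_j) = f(x_i, x_j, anchor elsewhere) - f_i - f_j - f0
        return lambda xi, xj: (
            at({i: xi, j: xj}) - uni[i](xi) - uni[j](xj) - f0
        )

    bi = {(i, j): make_bi(i, j) for i, j in combinations(range(d), 2)}
    return f0, uni, bi


def truncated_logit(x, f0, uni, bi):
    """Sum only the constant, one-variable and two-variable terms."""
    total = f0
    total += sum(f(x[i]) for i, f in uni.items())
    total += sum(f(x[i], x[j]) for (i, j), f in bi.items())
    return total


def toy_logit(x):
    # Hypothetical stand-in for a trained classifier's logit output.
    return 1.0 + 2.0 * x[0] - x[1] + 0.5 * x[0] * x[2]
```

For a logit with no interactions above second order, such as `toy_logit`, the truncated sum reproduces the original function; for a real black box it is only an approximation, and the retained terms would then serve as inputs to the sparse logistic regression mentioned in the abstract.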

Pages: 26

