How to Open a Black Box Classifier for Tabular Data

Cited by: 6
Authors
Walters, Bradley [1]
Ortega-Martorell, Sandra [1]
Olier, Ivan [1]
Lisboa, Paulo J. G. [1]
Affiliation
[1] Liverpool John Moores Univ, Sch Comp Sci & Math, Liverpool L3 2AF, England
Keywords
ANOVA; Shapley values; self-explaining neural networks; generalised additive models; interpretability
DOI
10.3390/a16040181
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A lack of transparency in machine learning models can limit their application. We show that analysis of variance (ANOVA) methods extract interpretable predictive models from them. This is possible because ANOVA decompositions represent multivariate functions as sums of functions of fewer variables. Retaining the terms in the ANOVA summation involving functions of only one or two variables provides an efficient method to open black box classifiers. The proposed method builds generalised additive models (GAMs) by application of L1 regularised logistic regression to the component terms retained from the ANOVA decomposition of the logit function. The resulting GAMs are derived using two alternative measures, Dirac and Lebesgue. Both measures produce functions that are smooth and consistent. The term partial responses in structured models (PRiSM) describes the family of models that are derived from black box classifiers by application of ANOVA decompositions. We demonstrate their interpretability and performance for the multilayer perceptron, support vector machines and gradient-boosting machines applied to synthetic data and several real-world data sets, namely Pima Diabetes, German Credit Card, and Statlog Shuttle from the UCI repository. The GAMs are shown to be compliant with the basic principles of a formal framework for interpretability.
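The truncation step described in the abstract can be illustrated with a minimal sketch. It assumes that the Dirac measure corresponds to anchoring the decomposition at a single reference point, which the abstract does not spell out. The names `anchored_anova_terms`, `truncated_logit`, and `toy_logit` are hypothetical, and `toy_logit` merely stands in for the logit of a trained black box classifier. The subsequent L1 regularised logistic regression over the retained terms is not shown.

```python
from itertools import combinations


def anchored_anova_terms(logit, anchor):
    """Build the constant, univariate and bivariate ANOVA terms of `logit`.

    Assumption: the Dirac measure is placed at `anchor`, so every term is
    obtained by evaluating `logit` with all other inputs fixed at the anchor.
    """
    f0 = logit(list(anchor))

    def evaluate(overrides):
        # Evaluate the logit at the anchor with selected coordinates replaced.
        x = list(anchor)
        for index, value in overrides.items():
            x[index] = value
        return logit(x)

    def f_i(i, xi):
        # Univariate term: effect of x_i alone, relative to the anchor.
        return evaluate({i: xi}) - f0

    def f_ij(i, j, xi, xj):
        # Bivariate term: what remains after removing both univariate effects.
        return evaluate({i: xi, j: xj}) - f_i(i, xi) - f_i(j, xj) - f0

    return f0, f_i, f_ij


def truncated_logit(x, f0, f_i, f_ij):
    """Sum only the terms that involve one or two variables."""
    d = len(x)
    total = f0 + sum(f_i(i, x[i]) for i in range(d))
    total += sum(f_ij(i, j, x[i], x[j]) for i, j in combinations(range(d), 2))
    return total


def toy_logit(x):
    # Hypothetical stand-in for a black box logit with one pairwise interaction.
    return 0.5 + 2.0 * x[0] - x[1] ** 2 + 1.5 * x[0] * x[2]


anchor = [0.0, 1.0, -1.0]
f0, f_i, f_ij = anchored_anova_terms(toy_logit, anchor)
x = [0.3, -0.7, 2.0]
approx = truncated_logit(x, f0, f_i, f_ij)
```

Because `toy_logit` contains no interaction above second order, the truncated sum reproduces it up to floating-point error. For a real classifier the truncation is an approximation, and its quality depends on how much of the logit is carried by higher-order terms.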
Pages: 26
Related papers
50 in total
  • [31] Enzyme's black box cracked open
    David H. Sherman
    Nature, 2009, 461 : 1068 - 1069
  • [32] A Tabular Open Data Search Engine Based on Word Embeddings for Data Integration
    Berenguer, Alberto
    Mazon, Jose-Norberto
    Tomas, David
    NEW TRENDS IN DATABASE AND INFORMATION SYSTEMS, ADBIS 2022, 2022, 1652 : 99 - 108
  • [33] Detect Black Box Signals with Enhanced Spectrum and Support Vector Classifier
    Lou, Chenlu
    Pan, Xiang
    2019 4TH INTERNATIONAL CONFERENCE ON COMMUNICATION, IMAGE AND SIGNAL PROCESSING (CCISP 2019), 2020, 1438
  • [34] Classifier Decoupled Training for Black-Box Unsupervised Domain Adaptation
    Chen, Xiangchuang
    Shen, Yunhang
    Luo, Xuan
    Zhang, Yan
    Li, Ke
    Lin, Shaohui
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT III, 2024, 14427 : 16 - 30
  • [35] A Differential-Evolution-Based Approach to Extract Univariate Decision Trees From Black-Box Models Using Tabular Data
    Rivera-Lopez, Rafael
    Ceballos, Hector G.
    IEEE ACCESS, 2024, 12 : 169850 - 169868
  • [36] Peering at the data inside the Black Box
    Oeffner, R.
    McCoy, A.
    Millan, C.
    Croll, T.
    Read, R.
    ACTA CRYSTALLOGRAPHICA A-FOUNDATION AND ADVANCES, 2022, 78 : E295 - E295
  • [37] How to improve an exponentiation black-box
    Cohen, G
    Lobstein, A
    Naccache, D
    Zémor, G
    ADVANCES IN CRYPTOLOGY - EUROCRYPT '98, 1998, 1403 : 211 - 220
  • [38] HOW THE PRISON IS A BLACK BOX IN PUNISHMENT THEORY
    Kerr, Lisa
    UNIVERSITY OF TORONTO LAW JOURNAL, 2019, 69 (01) : 85 - 116
  • [39] Tabular data
    Naomi Altman
    Martin Krzywinski
    Nature Methods, 2017, 14 (4) : 329 - 330
  • [40] Towards a tabular open data search engine for public sector information
    Berenguer, Alberto
    Mazon, Jose-Norberto
    Tomas, David
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021 : 5851 - 5853