How to Open a Black Box Classifier for Tabular Data

Cited: 6
Authors
Walters, Bradley [1 ]
Ortega-Martorell, Sandra [1 ]
Olier, Ivan [1 ]
Lisboa, Paulo J. G. [1 ]
Affiliation
[1] Liverpool John Moores Univ, Sch Comp Sci & Math, Liverpool L3 2AF, England
Keywords
ANOVA; Shapley values; self-explaining neural networks; generalised additive models; interpretability;
DOI
10.3390/a16040181
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A lack of transparency in machine learning models can limit their application. We show that analysis of variance (ANOVA) methods can extract interpretable predictive models from them. This is possible because ANOVA decompositions represent multivariate functions as sums of functions of fewer variables. Retaining only the terms in the ANOVA summation that involve functions of one or two variables provides an efficient method to open black-box classifiers. The proposed method builds generalised additive models (GAMs) by applying L1-regularised logistic regression to the component terms retained from the ANOVA decomposition of the logit function. The resulting GAMs are derived using two alternative measures, the Dirac and the Lebesgue measure. Both measures produce functions that are smooth and consistent. The term partial responses in structured models (PRiSM) describes the family of models derived from black-box classifiers by application of ANOVA decompositions. We demonstrate their interpretability and performance for the multilayer perceptron, support vector machines, and gradient-boosting machines applied to synthetic data and several real-world data sets, namely Pima Diabetes, German Credit Card, and Statlog Shuttle from the UCI repository. The GAMs are shown to be compliant with the basic principles of a formal framework for interpretability.
Pages: 26
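The pipeline the abstract describes can be sketched in Python. This is an illustration only, not the authors' implementation: the first-order ANOVA components of the logit are approximated by partial-dependence-style averages over the empirical data distribution (a stand-in for the Lebesgue-measure variant), and an L1-penalised logistic regression then selects which component terms survive in the resulting GAM. All helper names (`logit`, `first_order_component`) are hypothetical.

```python
# Sketch of a PRiSM-style first-order decomposition of a black-box classifier.
# Assumptions: Lebesgue measure approximated by the empirical distribution;
# only first-order (univariate) terms are kept for brevity.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=4, random_state=0)

# 1. Train the black-box classifier (here a small MLP).
black_box = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000,
                          random_state=0).fit(X, y)

def logit(p, eps=1e-9):
    p = np.clip(p, eps, 1 - eps)
    return np.log(p / (1 - p))

def first_order_component(model, X, j):
    """Univariate component of the logit for feature j, estimated by
    averaging the model's logit over the remaining features."""
    n = X.shape[0]
    comp = np.empty(n)
    for i in range(n):
        Xi = X.copy()
        Xi[:, j] = X[i, j]          # fix feature j at its i-th observed value
        comp[i] = logit(model.predict_proba(Xi)[:, 1]).mean()
    return comp - comp.mean()       # centre so the intercept is absorbed

# 2. Build the design matrix of first-order component terms.
Z = np.column_stack([first_order_component(black_box, X, j)
                     for j in range(X.shape[1])])

# 3. Fit L1-regularised logistic regression on the components: the terms
#    with non-zero coefficients form a sparse generalised additive model.
gam = LogisticRegression(penalty="l1", solver="liblinear", C=1.0).fit(Z, y)
print("retained terms:", np.flatnonzero(gam.coef_[0]))
```

The Dirac-measure variant would instead fix the remaining features at a single reference point rather than averaging over the data; second-order (pairwise) terms follow the same pattern with two features held jointly.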