Extracting Explanations, Justification, and Uncertainty from Black-Box Deep Neural Networks

Cited by: 0

Authors:

Ardis, Paul [1]

Flenner, Arjuna [2]

Affiliations:

[1] GE Aerosp Res, 1 Res Circle, Niskayuna, NY 12309 USA

[2] GE Aerosp, 3290 Patterson Ave SE, Grand Rapids, MI 49512 USA

Source:

ASSURANCE AND SECURITY FOR AI-ENABLED SYSTEMS | 2024 / Vol. 13054

DOI:

10.1117/12.3012765

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Deep Neural Networks (DNNs) do not inherently compute or exhibit empirically justified task confidence. In mission-critical applications, it is important to understand both the associated DNN reasoning and its supporting evidence. In this paper, we propose a novel Bayesian approach to extract explanations, justifications, and uncertainty estimates from DNNs. Our approach is efficient in terms of both memory and computation, and can be applied to any black-box DNN without retraining, including in anomaly detection and out-of-distribution detection tasks. We validate our approach on the CIFAR-10 dataset and show that it can significantly improve the interpretability and reliability of DNNs.

Pages: 8
