Explaining Neural Networks Using Attentive Knowledge Distillation

Cited by: 4
Authors
Lee, Hyeonseok [1 ]
Kim, Sungchan [1 ,2 ]
Affiliations
[1] Jeonbuk Natl Univ, Div Comp Sci & Engn, Jeonju Si 54896, Jeollabuk Do, South Korea
[2] Jeonbuk Natl Univ, Res Ctr Artificial Intelligence Technol, Jeonju Si 54896, Jeollabuk Do, South Korea
Funding
National Research Foundation of Singapore;
Keywords
deep neural networks; visual explanation; attention; knowledge distillation; fine-grained classification;
DOI
10.3390/s21041280
CLC Number
O65 [Analytical Chemistry];
Subject Classification Codes
070302; 081704;
Abstract
Explaining the predictions of deep neural networks makes the networks more understandable and trustworthy, enabling their use in mission-critical tasks. Recent progress in the learning capability of networks has come primarily from their enormous numbers of parameters, which makes their operation hard to interpret, in contrast to classical white-box models. Generating saliency maps that identify the input features important to a model's prediction is a popular approach to this problem. Existing explanation methods typically use only the output of the model's last convolutional layer to generate a saliency map, discarding the information contained in intermediate layers; the resulting explanations are therefore coarse and of limited accuracy. Although accuracy can be improved by refining a saliency map iteratively, doing so is too time-consuming to be practical. To address these problems, we propose a novel approach that explains a model's prediction by training an attentive surrogate network via knowledge distillation. The surrogate network generates a fine-grained saliency map for the model's prediction by exploiting the meaningful regional information present across all network layers. Experiments demonstrate that the saliency maps reflect spatially attentive features learned through distillation and are therefore useful for fine-grained classification tasks. Moreover, the proposed method runs at 24.3 frames per second, orders of magnitude faster than existing methods.
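The abstract gives only a high-level description of the method. The snippet below is a minimal, hypothetical sketch of attention-style knowledge distillation between a teacher and a surrogate (student) network in PyTorch; the attention definition, layer pairing, temperature, and loss weight beta are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch (assumed, not the authors' code): a surrogate (student) network
# is trained to match the teacher's softened prediction and the spatial attention
# of several paired intermediate layers.
import torch
import torch.nn.functional as F


def spatial_attention(feat: torch.Tensor) -> torch.Tensor:
    """Collapse a feature map (B, C, H, W) into an L2-normalized spatial attention vector (B, H*W)."""
    attn = feat.pow(2).mean(dim=1)            # channel-wise energy -> (B, H, W)
    return F.normalize(attn.flatten(1), p=2, dim=1)


def distillation_loss(student_logits, teacher_logits,
                      student_feats, teacher_feats,
                      temperature=4.0, beta=1e3):
    """KL divergence on softened logits plus attention matching over paired layers (weights are assumptions)."""
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2

    attn_loss = 0.0
    for s, t in zip(student_feats, teacher_feats):
        if s.shape[-2:] != t.shape[-2:]:      # resize teacher map if resolutions differ
            t = F.interpolate(t, size=s.shape[-2:], mode="bilinear", align_corners=False)
        attn_loss = attn_loss + (spatial_attention(s) - spatial_attention(t)).pow(2).mean()

    return kd + beta * attn_loss
```

In the spirit of the abstract's description, such per-layer attention maps could then be upsampled to the input resolution and combined into a single fine-grained saliency map in one forward pass, which is what would make the surrogate fast compared with iterative map-refinement methods.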
Pages: 1-17 (17 pages)
Related Papers (50 in total)
  • [31] Geometric Knowledge Distillation: Topology Compression for Graph Neural Networks
    Yang, Chenxiao
    Wu, Qitian
    Yan, Junchi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [32] Boosting Graph Neural Networks via Adaptive Knowledge Distillation
    Guo, Zhichun
    Zhang, Chunhui
    Fan, Yujie
    Tian, Yijun
    Zhang, Chuxu
    Chawla, Nitesh V.
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 6, 2023, : 7793 - 7801
  • [33] Adventitious Respiratory Classification using Attentive Residual Neural Networks
    Yang, Zijiang
    Liu, Shuo
    Song, Meishu
    Parada-Cabaleiro, Emilia
    Schuller, Bjoern W.
    INTERSPEECH 2020, 2020, : 2912 - 2916
  • [34] On using neural networks models for distillation control
    Munsif, HP
    Riggs, JB
    DISTILLATION AND ABSORPTION '97, VOLS 1 AND 2, 1997, (142): 259 - 268
  • [35] Designing and Training of Lightweight Neural Networks on Edge Devices Using Early Halting in Knowledge Distillation
    Mishra, Rahul
    Gupta, Hari Prabhat
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (05) : 4665 - 4677
  • [36] Explaining the Unexplained: A CLass-Enhanced Attentive Response (CLEAR) Approach to Understanding Deep Neural Networks
    Kumar, Devinder
    Wong, Alexander
    Taylor, Graham W.
    2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 2017, : 1686 - 1694
  • [37] A Neural Attentive Model Using Human Semantic Knowledge for Clickbait Detection
    Wei, Feng
    Nguyen, Uyen Trang
    2020 IEEE INTL SYMP ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, INTL CONF ON BIG DATA & CLOUD COMPUTING, INTL SYMP SOCIAL COMPUTING & NETWORKING, INTL CONF ON SUSTAINABLE COMPUTING & COMMUNICATIONS (ISPA/BDCLOUD/SOCIALCOM/SUSTAINCOM 2020), 2020, : 770 - 776
  • [38] Frustratingly Easy Knowledge Distillation via Attentive Similarity Matching
    Chen, Dingyao
    Tan, Huibin
    Lan, Long
    Zhang, Xiang
    Liang, Tianyi
    Luo, Zhigang
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 2357 - 2363
  • [39] Improving Neural Topic Models using Knowledge Distillation
    Hoyle, Alexander
    Goel, Pranav
    Resnik, Philip
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 1752 - 1771
  • [40] Extract the Knowledge of Graph Neural Networks and Go Beyond it: An Effective Knowledge Distillation Framework
    Yang, Cheng
    Liu, Jiawei
    Shi, Chuan
    PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE 2021 (WWW 2021), 2021, : 1227 - 1237