Explaining Neural Networks Using Attentive Knowledge Distillation

Cited by: 4
|
Authors
Lee, Hyeonseok [1 ]
Kim, Sungchan [1 ,2 ]
Affiliations
[1] Jeonbuk Natl Univ, Div Comp Sci & Engn, Jeonju Si 54896, Jeollabuk Do, South Korea
[2] Jeonbuk Natl Univ, Res Ctr Artificial Intelligence Technol, Jeonju Si 54896, Jeollabuk Do, South Korea
Funding
National Research Foundation of Singapore;
Keywords
deep neural networks; visual explanation; attention; knowledge distillation; fine-grained classification;
DOI
10.3390/s21041280
Chinese Library Classification (CLC)
O65 [Analytical Chemistry];
Discipline Classification Codes
070302; 081704;
Abstract
Explaining the predictions of deep neural networks makes the networks more understandable and trusted, enabling their use in various mission-critical tasks. Recent progress in the learning capability of networks has come primarily from the enormous number of model parameters, so it is usually hard to interpret their operation, in contrast to classical white-box models. Generating saliency maps is a popular approach to identifying the important input features used for a model's prediction. Existing explanation methods typically use only the output of the last convolution layer of the model to generate a saliency map, ignoring the information contained in intermediate layers. The resulting explanations are therefore coarse and of limited accuracy. Although accuracy can be improved by iteratively refining a saliency map, this is too time-consuming to be practical. To address these problems, we propose a novel approach that explains the model prediction by training an attentive surrogate network using knowledge distillation. The surrogate network aims to generate a fine-grained saliency map corresponding to the model prediction by using meaningful regional information present across all network layers. Experiments demonstrated that the saliency maps are the result of spatially attentive features learned from the distillation; thus, they are useful for fine-grained classification tasks. Moreover, the proposed method runs at 24.3 frames per second, which is faster than existing methods by orders of magnitude.
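The abstract describes distilling a model into an attentive surrogate and building a saliency map from attention over all layers. The sketch below is a minimal, illustrative PyTorch rendering of that idea, assuming an attention-transfer-style distillation loss (soft-label KD plus per-layer spatial-attention matching) and saliency obtained by averaging upsampled per-layer attention maps; the function names, loss weights, temperature, and layer choices are assumptions for illustration, not the authors' actual implementation.

```python
import torch
import torch.nn.functional as F

def spatial_attention(feat, eps=1e-8):
    # Channel-pooled, L2-normalized spatial attention of a (B, C, H, W) feature map.
    a = feat.pow(2).mean(dim=1, keepdim=True)
    return a / (a.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + eps)

def distillation_loss(student_logits, teacher_logits, s_feats, t_feats, T=4.0, beta=1e3):
    # Soft-label KD term plus an attention-transfer term over matched layer pairs
    # (illustrative loss weights; not the paper's values).
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    at = sum(F.mse_loss(spatial_attention(s), spatial_attention(t))
             for s, t in zip(s_feats, t_feats))
    return kd + beta * at

def saliency_from_layers(feats, out_size=(224, 224)):
    # Upsample each layer's attention map to input resolution and average them,
    # so coarse high-level and fine low-level cues both contribute.
    maps = [F.interpolate(spatial_attention(f), size=out_size,
                          mode="bilinear", align_corners=False) for f in feats]
    return torch.stack(maps, dim=0).mean(dim=0)  # (B, 1, H, W)

# Toy usage with random tensors standing in for teacher/surrogate activations.
s_feats = [torch.randn(2, c, r, r) for c, r in [(64, 56), (128, 28), (256, 14)]]
t_feats = [torch.randn_like(f) for f in s_feats]
loss = distillation_loss(torch.randn(2, 200), torch.randn(2, 200), s_feats, t_feats)
saliency = saliency_from_layers(s_feats)
```

Because the saliency map is computed in a single forward pass of the surrogate, this kind of design avoids the iterative refinement that makes earlier perturbation-based explanation methods slow.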
Pages: 1-17
Number of pages: 17
Related Papers
50 records
  • [1] Neural Compatibility Modeling with Attentive Knowledge Distillation
    Song, Xuemeng
    Feng, Fuli
    Han, Xianjing
    Yang, Xin
    Liu, Wei
    Nie, Liqiang
    ACM/SIGIR PROCEEDINGS 2018, 2018, : 5 - 14
  • [2] Channel Planting for Deep Neural Networks using Knowledge Distillation
    Mitsuno, Kakeru
    Nomura, Yuichiro
    Kurita, Takio
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 7573 - 7579
  • [3] Explaining Knowledge Distillation by Quantifying the Knowledge
    Cheng, Xu
    Rao, Zhefan
    Chen, Yilan
    Zhang, Quanshi
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020, : 12922 - 12932
  • [4] Knowledge distillation on neural networks for evolving graphs
    Antaris, Stefanos
    Rafailidis, Dimitrios
    Girdzijauskas, Sarunas
    SOCIAL NETWORK ANALYSIS AND MINING, 2021, 11 (01)
  • [5] On Representation Knowledge Distillation for Graph Neural Networks
    Joshi, Chaitanya K.
    Liu, Fayao
    Xun, Xu
    Lin, Jie
    Foo, Chuan Sheng
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (04) : 4656 - 4667
  • [6] Knowledge distillation on neural networks for evolving graphs
    Stefanos Antaris
    Dimitrios Rafailidis
    Sarunas Girdzijauskas
    Social Network Analysis and Mining, 2021, 11
  • [7] Video Summarization Using Knowledge Distillation-Based Attentive Network
    Qin, Jialin
    Yu, Hui
    Liang, Wei
    Ding, Derui
    COGNITIVE COMPUTATION, 2024, 16 (03) : 1022 - 1031
  • [8] IMF: Integrating Matched Features Using Attentive Logit in Knowledge Distillation
    Kim, Jeongho
    Lee, Hanbeen
    Woo, Simon S.
    PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 974 - +
  • [9] RELIANT: Fair Knowledge Distillation for Graph Neural Networks
    Dong, Yushun
    Zhang, Binchi
    Yuan, Yiling
    Zou, Na
    Wang, Qi
    Li, Jundong
    PROCEEDINGS OF THE 2023 SIAM INTERNATIONAL CONFERENCE ON DATA MINING, SDM, 2023, : 154 - +
  • [10] Adaptively Denoising Graph Neural Networks for Knowledge Distillation
    Guo, Yuxin
    Yang, Cheng
    Shi, Chuan
    Tu, Ke
    Wu, Zhengwei
    Zhang, Zhiqiang
    Zhou, Jun
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES-RESEARCH TRACK AND DEMO TRACK, PT VIII, ECML PKDD 2024, 2024, 14948 : 253 - 269