Self-knowledge distillation based on knowledge transfer from soft to hard examples

Cited by: 4
Authors
Tang, Yuan [1 ]
Chen, Ying [1 ]
Xie, Linbo [2 ]
Affiliations
[1] Jiangnan Univ, Minist Educ, Key Lab Adv Proc Control Light Ind, Wuxi 214122, Peoples R China
[2] Jiangnan Univ, Minist Educ, Engn Res Ctr Internet Things Technol Applicat, Wuxi 214122, Peoples R China
Keywords
Model compression; Self-knowledge distillation; Hard examples; Class probability consistency; Memory bank;
DOI
10.1016/j.imavis.2023.104700
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To fully exploit the knowledge in a self-knowledge distillation network, in which a student model is progressively trained to distill its own knowledge without a pre-trained teacher model, a self-knowledge distillation method based on knowledge transfer from soft to hard examples is proposed. A knowledge transfer module is designed to exploit the dark knowledge of hard examples by enforcing class probability consistency between hard and soft examples. It reduces the confidence of wrong predictions by transferring class information from the soft probability distributions of an auxiliary self-teacher network to the classifier network (self-student network). Furthermore, a dynamic memory bank for softened probability distributions is introduced, together with its updating strategy. Experiments show that the method improves accuracy by 0.64% on average on classification datasets and by 3.87% on average on fine-grained visual recognition tasks, making its performance superior to the state of the art. (c) 2023 Elsevier B.V. All rights reserved.
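The abstract names three ingredients: an auxiliary self-teacher that produces softened class distributions, a knowledge transfer term that pulls the classifier's (self-student's) predictions on hard examples toward those soft distributions, and a dynamic memory bank of softened probabilities with an updating rule. The paper's exact formulation is not reproduced in this record; the Python/PyTorch sketch below only illustrates how such a consistency loss and memory bank could fit together. All names and hyper-parameters (SoftLabelMemoryBank, tau, momentum, the 0.5/0.5 mixing, the EMA update) are assumptions for illustration, and the paper's criterion for selecting hard examples is omitted.

# Minimal sketch (not the authors' implementation) of the two ideas named in the
# abstract: (1) a dynamic memory bank of softened class distributions and (2) a
# KL-divergence consistency loss that transfers soft knowledge from an auxiliary
# self-teacher to the classifier (self-student). Names and the EMA update rule
# are assumptions, not taken from the paper.
import torch
import torch.nn.functional as F


class SoftLabelMemoryBank:
    """Keeps one softened probability vector per class, updated with momentum.

    The bank is stored on CPU for simplicity; lookups are moved back to the
    caller's device.
    """

    def __init__(self, num_classes: int, momentum: float = 0.9):
        # Start from uniform distributions before any update has been seen.
        self.bank = torch.full((num_classes, num_classes), 1.0 / num_classes)
        self.momentum = momentum

    @torch.no_grad()
    def update(self, soft_probs: torch.Tensor, labels: torch.Tensor) -> None:
        # Exponential-moving-average update of each class entry (assumed strategy).
        for p, y in zip(soft_probs.cpu(), labels.cpu()):
            self.bank[y] = self.momentum * self.bank[y] + (1.0 - self.momentum) * p

    def lookup(self, labels: torch.Tensor) -> torch.Tensor:
        return self.bank[labels.cpu()].to(labels.device)


def soft_to_hard_transfer_loss(student_logits, teacher_logits, labels,
                               bank: SoftLabelMemoryBank, tau: float = 4.0):
    """Cross-entropy plus a KL consistency term toward soft targets.

    Soft targets mix the auxiliary self-teacher's temperature-softened
    distribution with the memory-bank entry of the ground-truth class
    (the equal mixing is an assumption made for this sketch).
    """
    teacher_soft = F.softmax(teacher_logits.detach() / tau, dim=1)
    bank.update(teacher_soft, labels)                      # keep the bank current
    soft_targets = 0.5 * teacher_soft + 0.5 * bank.lookup(labels)

    log_student = F.log_softmax(student_logits / tau, dim=1)
    kd = F.kl_div(log_student, soft_targets, reduction="batchmean") * tau * tau
    ce = F.cross_entropy(student_logits, labels)           # standard hard-label term
    return ce + kd

A training step would compute student and auxiliary-teacher logits for a batch, call soft_to_hard_transfer_loss(student_logits, teacher_logits, labels, bank), and back-propagate the returned loss; gradients flow only through the student because the teacher distribution is detached.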
Pages: 11
Related papers
50 records in total
  • [1] From Knowledge Distillation to Self-Knowledge Distillation: A Unified Approach with Normalized Loss and Customized Soft Labels
    Yang, Zhendong
    Zeng, Ailing
    Li, Zhe
    Zhang, Tianke
    Yuan, Chun
    Li, Yu
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 17139 - 17148
  • [2] Self-knowledge distillation with dimensional history knowledge
    Huang, Wenke
    Ye, Mang
    Shi, Zekun
    Li, He
    Du, Bo
    Science China Information Sciences, 2025, 68 (9)
  • [3] Neighbor self-knowledge distillation
    Liang, Peng
    Zhang, Weiwei
    Wang, Junhuang
    Guo, Yufeng
    INFORMATION SCIENCES, 2024, 654
  • [4] Two-Stage Approach for Targeted Knowledge Transfer in Self-Knowledge Distillation
    Yin, Zimo
    Pu, Jian
    Zhou, Yijie
    Xue, Xiangyang
    IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2024, 11 (11) : 2270 - 2283
  • [5] Self-knowledge distillation based on dynamic mixed attention
    Tang, Yuan
    Chen, Ying
    Kongzhi yu Juece/Control and Decision, 2024, 39 (12): 4099 - 4108
  • [6] Self-knowledge distillation via dropout
    Lee, Hyoje
    Park, Yeachan
    Seo, Hyun
    Kang, Myungjoo
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2023, 233
  • [7] Dual teachers for self-knowledge distillation
    Li, Zheng
    Li, Xiang
    Yang, Lingfeng
    Song, Renjie
    Yang, Jian
    Pan, Zhigeng
    PATTERN RECOGNITION, 2024, 151
  • [8] Sliding Cross Entropy for Self-Knowledge Distillation
    Lee, Hanbeen
    Kim, Jeongho
    Woo, Simon S.
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022, 2022, : 1044 - 1053
  • [9] Self-Knowledge Distillation with Progressive Refinement of Targets
    Kim, Kyungyul
    Ji, ByeongMoon
    Yoon, Doyoung
    Hwang, Sangheum
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 6547 - 6556