IMF: Integrating Matched Features Using Attentive Logit in Knowledge Distillation

Cited by: 0
Authors
Kim, Jeongho [1 ]
Lee, Hanbeen [2 ]
Woo, Simon S. [3 ]
Affiliations
[1] Korea Adv Inst Sci & Technol, Daejeon, South Korea
[2] NAVER Z Corp, Seongnam, South Korea
[3] Sungkyunkwan Univ, Dept Artificial Intelligence, Seoul, South Korea
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Knowledge distillation (KD) is an effective method for transferring the knowledge of a teacher model to a student model, with the aim of efficiently improving the student's performance. Although generic knowledge distillation methods such as softmax representation distillation and intermediate feature matching have demonstrated improvements across various tasks, they yield only marginal gains in student networks because of the students' limited model capacity. In this work, to address this limitation of the student model, we propose a novel and flexible KD framework, Integrating Matched Features using Attentive Logit in Knowledge Distillation (IMF). Our approach introduces an intermediate feature distiller (IFD) that improves the overall performance of the student model by directly distilling the teacher's knowledge into branches of the student model. The outputs of the IFD, which is trained by the teacher model, are effectively combined via an attentive logit. At inference, we use only a few blocks of the student together with the trained IFD, requiring an equal or smaller number of parameters. Through extensive experiments, we demonstrate that IMF consistently outperforms other state-of-the-art methods by a large margin on various datasets and tasks without extra computation.
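To make the pipeline described in the abstract concrete, the following is a minimal, hypothetical PyTorch-style sketch: a branch head that maps an intermediate student feature to logits and is trained against the teacher (an IFD-like module), an attention-weighted fusion of branch logits (an assumed reading of "attentive logit"), and a standard softened-logit KD loss. All names, the fusion scheme, and hyperparameters here (FeatureDistillerBranch, AttentiveLogitFusion, temperature, alpha) are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureDistillerBranch(nn.Module):
    # Hypothetical IFD-style branch: maps an intermediate student feature
    # map to class logits so it can be trained directly against the teacher.
    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(in_channels, num_classes),
        )

    def forward(self, feature_map):
        return self.head(feature_map)

class AttentiveLogitFusion(nn.Module):
    # Assumed fusion: softmax-normalized learnable weights over branch logits.
    def __init__(self, num_branches):
        super().__init__()
        self.scores = nn.Parameter(torch.zeros(num_branches))

    def forward(self, branch_logits):                 # list of (B, C) tensors
        weights = torch.softmax(self.scores, dim=0)   # (K,)
        stacked = torch.stack(branch_logits, dim=0)   # (K, B, C)
        return (weights[:, None, None] * stacked).sum(dim=0)

def kd_loss(student_logits, teacher_logits, temperature=4.0):
    # Standard softened-logit KL distillation term.
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

def train_step(student, teacher, branches, fusion, images, labels, alpha=0.5):
    # Assumes `student(images)` returns (final_logits, [intermediate feature maps]).
    with torch.no_grad():
        teacher_logits = teacher(images)
    student_logits, features = student(images)
    branch_logits = [branch(f) for branch, f in zip(branches, features)]
    fused_logits = fusion(branch_logits + [student_logits])
    loss = F.cross_entropy(fused_logits, labels)
    loss = loss + alpha * kd_loss(fused_logits, teacher_logits)
    for logits in branch_logits:                      # each branch learns from the teacher
        loss = loss + alpha * kd_loss(logits, teacher_logits)
    return loss

In this sketch, AttentiveLogitFusion would be constructed with num_branches = len(branches) + 1 so the student's own logits count as one branch; the actual branch design and fusion mechanism in IMF may differ from these assumptions.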
Pages: 974+
Number of pages: 10
Related Papers (50 in total)
  • [1] Explaining Neural Networks Using Attentive Knowledge Distillation
    Lee, Hyeonseok
    Kim, Sungchan
    SENSORS, 2021, 21 (04) : 1 - 17
  • [2] Video Summarization Using Knowledge Distillation-Based Attentive Network
    Qin, Jialin
    Yu, Hui
    Liang, Wei
    Ding, Derui
    COGNITIVE COMPUTATION, 2024, 16 (03) : 1022 - 1031
  • [3] Neural Compatibility Modeling with Attentive Knowledge Distillation
    Song, Xuemeng
    Feng, Fuli
    Han, Xianjing
    Yang, Xin
    Liu, Wei
    Nie, Liqiang
    ACM/SIGIR PROCEEDINGS 2018, 2018 : 5 - 14
  • [4] Leveraging logit uncertainty for better knowledge distillation
    Guo, Zhen
    Wang, Dong
    He, Qiang
    Zhang, Pengzhou
    SCIENTIFIC REPORTS, 2024, 14 (01)
  • [5] DistillGrasp: Integrating Features Correlation With Knowledge Distillation for Depth Completion of Transparent Objects
    Huang, Yiheng
    Chen, Junhong
    Michiels, Nick
    Asim, Muhammad
    Claesen, Luc
    Liu, Wenyin
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (10) : 8945 - 8952
  • [6] Frustratingly Easy Knowledge Distillation via Attentive Similarity Matching
    Chen, Dingyao
    Tan, Huibin
    Lan, Long
    Zhang, Xiang
    Liang, Tianyi
    Luo, Zhigang
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022 : 2357 - 2363
  • [7] KAT: knowledge-aware attentive recommendation model integrating two-terminal neighbor features
    Liu, Tianqi
    Zhang, Xinxin
    Wang, Wenzheng
    Mu, Weisong
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, 15 (11) : 4941 - 4958
  • [8] KNOWLEDGE DISTILLATION WITH CATEGORY-AWARE ATTENTION AND DISCRIMINANT LOGIT LOSSES
    Jiang, Lei
    Zhou, Wengang
    Li, Houqiang
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019 : 1792 - 1797
  • [9] Convolution Attentive Knowledge Tracing with comprehensive behavioral features
    Xing, Jiaqi
    Li, Kaixuan
    Wu, Yuheng
    Gao, Zhizezhang
    Liu, Xingyu
    Sun, Xia
    Feng, Jun
    PROCEEDINGS OF THE ACM TURING AWARD CELEBRATION CONFERENCE-CHINA 2024, ACM-TURC 2024, 2024 : 48 - 52
  • [10] Why logit distillation works: A novel knowledge distillation technique by deriving target augmentation and logits distortion
    Hossain, Md Imtiaz
    Akhter, Sharmen
    Mahbub, Nosin Ibna
    Hong, Choong Seon
    Huh, Eui-Nam
    INFORMATION PROCESSING & MANAGEMENT, 2025, 62 (03)