Decoupled Multi-teacher Knowledge Distillation based on Entropy

Cited: 0
Authors
Cheng, Xin [1 ]
Tang, Jialiang [2 ]
Zhang, Zhiqiang [3 ]
Yu, Wenxin [3 ]
Jiang, Ning [3 ]
Zhou, Jinjia [1 ]
Affiliations
[1] Hosei Univ, Grad Sch Sci & Engn, Tokyo, Japan
[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing, Peoples R China
[3] Southwest Univ Sci & Technol, Sch Comp Sci & Technol, Mianyang, Sichuan, Peoples R China
Keywords
Multi-teacher knowledge distillation; image classification; entropy; deep learning
DOI
10.1109/ISCAS58744.2024.10558141
Chinese Library Classification
TP39 [Computer Applications]
Discipline Codes
081203; 0835
Abstract
Multi-teacher knowledge distillation (MKD) aims to leverage the valuable and diverse knowledge provided by multiple teacher networks to improve the performance of the student network. Existing approaches typically combine the teachers' knowledge by simply averaging their prediction logits or by using sub-optimal weighting strategies. Such techniques fail to reflect the relative importance of each teacher and may even mislead the student's learning. To address these issues, we propose a novel Decoupled Multi-teacher Knowledge Distillation based on Entropy (DE-MKD). DE-MKD decomposes the vanilla KD loss and assigns a weight to each teacher, reflecting its importance through the entropy of its predictions. Furthermore, we extend the proposed approach to distill the intermediate features of the teachers, further improving the performance of the student network. Extensive experiments on the publicly available CIFAR-100 image classification dataset demonstrate the effectiveness and flexibility of the proposed approach.
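The record contains no implementation details beyond the abstract, but the core idea of entropy-based teacher weighting can be sketched briefly. The following is a minimal, hypothetical PyTorch sketch, assuming each teacher's weight comes from a softmax over the negative entropies of its temperature-softened predictions (so more confident, lower-entropy teachers get larger weights) and that a standard KL-divergence KD loss is used per teacher; the paper's exact decoupled formulation is not reproduced here, and all function names and the temperature T are illustrative.

# Hypothetical sketch of entropy-based teacher weighting for multi-teacher KD.
# The weighting rule (softmax over negative entropies) is an assumption made
# for illustration; the paper's exact DE-MKD formulation may differ.
import torch
import torch.nn.functional as F

def prediction_entropy(logits, T=4.0):
    # Shannon entropy of the temperature-softened prediction, per sample.
    p = F.softmax(logits / T, dim=1)
    return -(p * p.clamp_min(1e-8).log()).sum(dim=1)  # shape: (batch,)

def entropy_weighted_kd_loss(student_logits, teacher_logits_list, T=4.0):
    # Per-sample entropies for each teacher, stacked to shape (K, batch).
    ents = torch.stack([prediction_entropy(t, T) for t in teacher_logits_list])
    # Lower entropy (more confident teacher) -> larger weight; weights sum
    # to 1 across teachers for every sample.
    weights = F.softmax(-ents, dim=0)                  # shape: (K, batch)
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    loss = 0.0
    for w, t_logits in zip(weights, teacher_logits_list):
        p_t = F.softmax(t_logits / T, dim=1)
        # Per-sample KL(teacher || student) on softened distributions.
        kl = F.kl_div(log_p_s, p_t, reduction="none").sum(dim=1)
        loss = loss + (w * kl).mean()
    return loss * (T * T)  # standard T^2 scaling used in KD

# Usage with random tensors standing in for network outputs:
s = torch.randn(8, 100)                       # student logits, CIFAR-100 classes
ts = [torch.randn(8, 100) for _ in range(3)]  # three teachers
print(entropy_weighted_kd_loss(s, ts))

In a real training loop this distillation term would be combined with the cross-entropy loss on ground-truth labels; following the abstract's extension, an analogous entropy-based weighting could also be applied to intermediate-feature distillation losses.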
Pages: 5