Multi-level Logit Distillation

Cited by: 38
Authors
Jin, Ying [1 ]
Wang, Jiaqi [2 ]
Lin, Dahua [1 ,2 ]
Affiliations
[1] Chinese Univ Hong Kong, CUHK Sense Time Joint Lab, Hong Kong, Peoples R China
[2] Shanghai AI Lab, Shanghai, Peoples R China
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023
DOI
10.1109/CVPR52729.2023.02325
CLC number
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Knowledge Distillation (KD) aims at distilling the knowledge from a large teacher model into a lightweight student model. Mainstream KD methods can be divided into two categories: logit distillation and feature distillation. The former is easy to implement but inferior in performance, while the latter is not applicable in some practical circumstances due to concerns such as privacy and safety. To address this dilemma, in this paper we explore a stronger logit distillation method that makes better use of the logit outputs. Concretely, we propose a simple yet effective approach to logit distillation via multi-level prediction alignment. Within this framework, prediction alignment is conducted not only at the instance level, but also at the batch and class levels, so that the student model learns instance predictions, input correlations, and category correlations simultaneously. In addition, a prediction augmentation mechanism based on model calibration further boosts performance. Extensive experimental results validate that our method consistently outperforms previous logit distillation methods, and even reaches performance competitive with mainstream feature distillation methods. Code is available at https://github.com/Jin-Ying/Multi-Level-Logit-Distillation.
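The abstract describes three alignment levels (instance, batch, class) plus temperature-based prediction augmentation. Below is a minimal PyTorch-style sketch of that idea, not the authors' released implementation: it assumes KL divergence for instance-level alignment and mean-squared alignment of prediction Gram matrices at the batch and class levels; the function name, temperature set, and equal loss weighting are illustrative choices.

```python
import torch
import torch.nn.functional as F

def multilevel_logit_distillation(student_logits, teacher_logits,
                                  temperatures=(1.0, 2.0, 4.0)):
    """Sketch of multi-level prediction alignment.

    student_logits, teacher_logits: (B, C) raw logits for one mini-batch.
    temperatures: softmax temperatures acting as prediction augmentation.
    """
    loss = student_logits.new_zeros(())
    for T in temperatures:
        # Softened predictions at this temperature.
        log_p_s = F.log_softmax(student_logits / T, dim=1)   # (B, C) log-probs
        p_s = log_p_s.exp()
        p_t = F.softmax(teacher_logits / T, dim=1)

        # Instance level: match each sample's prediction distribution
        # (T^2 scaling follows the usual KD convention).
        loss_inst = F.kl_div(log_p_s, p_t, reduction="batchmean") * (T ** 2)

        # Batch level: match sample-to-sample (input) correlations.
        g_s = p_s @ p_s.t()                                   # (B, B)
        g_t = p_t @ p_t.t()
        loss_batch = F.mse_loss(g_s, g_t)

        # Class level: match category-to-category correlations.
        c_s = p_s.t() @ p_s                                   # (C, C)
        c_t = p_t.t() @ p_t
        loss_class = F.mse_loss(c_s, c_t)

        loss = loss + loss_inst + loss_batch + loss_class
    return loss / len(temperatures)
```

In use, the student's logits and the detached teacher logits for a mini-batch would be passed in and the returned term added to the standard cross-entropy loss; consult the linked repository for the exact formulation and hyperparameters.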
Pages: 24276-24285
Page count: 10