Multi-level Logit Distillation

Cited by: 38
Authors
Jin, Ying [1]
Wang, Jiaqi [2]
Lin, Dahua [1,2]
Affiliations
[1] Chinese Univ Hong Kong, CUHK Sense Time Joint Lab, Hong Kong, Peoples R China
[2] Shanghai AI Lab, Shanghai, Peoples R China
Source
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
DOI
10.1109/CVPR52729.2023.02325
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Knowledge Distillation (KD) aims to distill knowledge from a large teacher model into a lightweight student model. Mainstream KD methods fall into two categories: logit distillation and feature distillation. The former is easy to implement but inferior in performance, while the latter is inapplicable in some practical settings due to privacy and safety concerns. To address this dilemma, we explore a stronger logit distillation method that makes better use of logit outputs. Concretely, we propose a simple yet effective approach to logit distillation via multi-level prediction alignment. Within this framework, predictions are aligned not only at the instance level but also at the batch and class levels, so that the student model learns instance predictions, input correlations, and category correlations simultaneously. In addition, a prediction augmentation mechanism based on model calibration further boosts performance. Extensive experiments validate that our method consistently outperforms previous logit distillation methods and even reaches performance competitive with mainstream feature distillation methods. Code is available at https://github.com/Jin-Ying/Multi-Level-Logit-Distillation.
Pages: 24276-24285 (10 pages)