Multi-level Logit Distillation

Cited by: 38
Authors
Jin, Ying [1]
Wang, Jiaqi [2]
Lin, Dahua [1,2]
Affiliations
[1] Chinese Univ Hong Kong, CUHK Sense Time Joint Lab, Hong Kong, Peoples R China
[2] Shanghai AI Lab, Shanghai, Peoples R China
Source
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
DOI
10.1109/CVPR52729.2023.02325
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Knowledge Distillation (KD) aims to distill knowledge from a large teacher model into a lightweight student model. Mainstream KD methods fall into two categories: logit distillation and feature distillation. The former is easy to implement but inferior in performance, while the latter is inapplicable in some practical settings due to privacy and safety concerns. To address this dilemma, we explore a stronger logit distillation method that makes better use of logit outputs. Concretely, we propose a simple yet effective approach to logit distillation via multi-level prediction alignment. Within this framework, predictions are aligned not only at the instance level but also at the batch and class levels, so that the student model learns instance predictions, input correlations, and category correlations simultaneously. In addition, a prediction augmentation mechanism based on model calibration further boosts performance. Extensive experiments validate that our method consistently outperforms previous logit distillation methods and even reaches performance competitive with mainstream feature distillation methods. Code is available at https://github.com/Jin-Ying/Multi-Level-Logit-Distillation.
Pages: 24276-24285 (10 pages)