Dual-Level Knowledge Distillation via Knowledge Alignment and Correlation

Cited by: 6
Authors
Ding, Fei [1 ]
Yang, Yin [1 ]
Hu, Hongxin [2 ]
Krovi, Venkat [3 ,4 ]
Luo, Feng [1 ]
Affiliations
[1] Clemson Univ, Sch Comp, Clemson, SC 29634 USA
[2] Univ at Buffalo, State Univ New York, Dept Comp Sci & Engn, Buffalo, NY 14260 USA
[3] Clemson Univ, Dept Automot Engn, Clemson, SC 29634 USA
[4] Clemson Univ, Dept Mech Engn, Clemson, SC 29634 USA
Funding
U.S. National Science Foundation;
Keywords
Correlation; Knowledge engineering; Task analysis; Standards; Network architecture; Prototypes; Training; Convolutional neural networks; dual-level knowledge; knowledge distillation (KD); representation learning; teacher-student model;
DOI
10.1109/TNNLS.2022.3190166
CLC Number
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Knowledge distillation (KD) has become a widely used technique for model compression and knowledge transfer. We find that the standard KD method performs knowledge alignment on individual samples only indirectly, via class prototypes, and neglects the structural knowledge between different samples, namely, knowledge correlation. Although recent contrastive learning-based distillation methods can be decomposed into knowledge alignment and correlation, their correlation objectives undesirably push apart representations of samples from the same class, leading to inferior distillation results. To improve the distillation performance, in this work, we propose a novel knowledge correlation objective and introduce dual-level knowledge distillation (DLKD), which explicitly combines knowledge alignment and correlation instead of using a single contrastive objective. We show that both knowledge alignment and correlation are necessary to improve distillation performance. In particular, knowledge correlation can serve as an effective regularization to learn generalized representations. The proposed DLKD is task-agnostic and model-agnostic, and enables effective knowledge transfer from supervised or self-supervised pretrained teachers to students. Experiments show that DLKD outperforms other state-of-the-art methods across a wide range of experimental settings, including: 1) pretraining strategies; 2) network architectures; 3) datasets; and 4) tasks.
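The abstract gives no implementation details, so the following is only a minimal, hypothetical sketch of how a dual-level objective of this kind (a per-sample alignment term plus a batch-level correlation term that matches similarity structure rather than contrasting samples) could be written in PyTorch. The loss forms, the cosine/KL choices, the temperature, the projection layer, and the weights alpha/beta are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a dual-level distillation loss (alignment + correlation),
# assuming PyTorch and generic teacher/student features. Not the paper's code.
import torch
import torch.nn.functional as F


def knowledge_alignment_loss(student_feat, teacher_feat):
    """Per-sample alignment: pull each student representation toward the
    corresponding (frozen) teacher representation."""
    s = F.normalize(student_feat, dim=1)
    t = F.normalize(teacher_feat, dim=1)
    # 1 - cosine similarity, averaged over the batch
    return (1.0 - (s * t).sum(dim=1)).mean()


def knowledge_correlation_loss(student_feat, teacher_feat, temperature=0.1):
    """Structural correlation: match the pairwise similarity distribution of
    the student batch to that of the teacher batch, instead of a contrastive
    objective that would push apart same-class samples."""
    s = F.normalize(student_feat, dim=1)
    t = F.normalize(teacher_feat, dim=1)
    # Row-wise similarity distributions over the samples in the batch
    p_teacher = F.softmax(t @ t.t() / temperature, dim=1)
    log_p_student = F.log_softmax(s @ s.t() / temperature, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean")


def dual_level_kd_loss(student_feat, teacher_feat, alpha=1.0, beta=1.0):
    """Combine the two levels; alpha and beta are hypothetical weights."""
    return (alpha * knowledge_alignment_loss(student_feat, teacher_feat)
            + beta * knowledge_correlation_loss(student_feat, teacher_feat))


if __name__ == "__main__":
    # Toy usage with random tensors standing in for backbone features.
    student = torch.randn(32, 128)
    teacher = torch.randn(32, 256)
    # Teacher and student dimensions usually differ, so project one of them.
    proj = torch.nn.Linear(256, 128)
    loss = dual_level_kd_loss(student, proj(teacher))
    print(float(loss))
```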
Pages: 2425 - 2435
Number of pages: 11
Related Papers
50 records in total
  • [1] Few-Shot Graph Anomaly Detection via Dual-Level Knowledge Distillation
    Li, Xuan
    Cheng, Dejie
    Zhang, Luheng
    Zhang, Chengfang
    Feng, Ziliang
    ENTROPY, 2025, 27 (01)
  • [2] Knowledge Distillation via Channel Correlation Structure
    Li, Bo
    Chen, Bin
    Wang, Yunxiao
    Dai, Tao
    Hu, Maowei
    Jiang, Yong
    Xia, Shutao
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT I, 2021, 12815 : 357 - 368
  • [3] Efficient Crowd Counting via Dual Knowledge Distillation
    Wang, Rui
    Hao, Yixue
    Hu, Long
    Li, Xianzhi
    Chen, Min
    Miao, Yiming
    Humar, Iztok
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 569 - 583
  • [4] Implicit Feature Alignment For Knowledge Distillation
    Chen, Dingyao
    Wang, Mengzhu
    Zhang, Xiang
    Liang, Tianyi
    Luo, Zhigang
    2022 IEEE 34TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2022, : 402 - 408
  • [5] Enhancement of Knowledge Distillation via Non-Linear Feature Alignment
    Jiangxiao Zhang
    Feng Gao
    Lina Huo
    Hongliang Wang
    Ying Dang
    Optical Memory and Neural Networks, 2023, 32 : 310 - 317
  • [6] Enhancement of Knowledge Distillation via Non-Linear Feature Alignment
    Zhang, Jiangxiao
    Gao, Feng
    Huo, Lina
    Wang, Hongliang
    Dang, Ying
    OPTICAL MEMORY AND NEURAL NETWORKS, 2023, 32 (04) : 310 - 317
  • [7] Dual-Level Adaptive and Discriminative Knowledge Transfer for Cross-Domain Recognition
    Meng, Min
    Lan, Mengcheng
    Yu, Jun
    Wu, Jigang
    Liu, Ligang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 2266 - 2279
  • [8] Correlation Congruence for Knowledge Distillation
    Peng, Baoyun
    Jin, Xiao
    Liu, Jiaheng
    Li, Dongsheng
    Wu, Yichao
    Liu, Yu
    Zhou, Shunfeng
    Zhang, Zhaoning
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 5006 - 5015
  • [9] Dual Teacher Knowledge Distillation With Domain Alignment for Face Anti-Spoofing
    Kong, Zhe
    Zhang, Wentian
    Wang, Tao
    Zhang, Kaihao
    Li, Yuexiang
    Tang, Xiaoying
    Luo, Wenhan
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (12) : 13177 - 13189
  • [10] Knowledge distillation via instance-level sequence learning
    Zhao, Haoran
    Sun, Xin
    Dong, Junyu
    Dong, Zihe
    Li, Qiong
    KNOWLEDGE-BASED SYSTEMS, 2021, 233