AdaDFKD: Exploring adaptive inter-sample relationship in data-free knowledge distillation

Cited by: 0
Authors
Li, Jingru [1 ]
Zhou, Sheng [1 ]
Li, Liangcheng [1 ]
Wang, Haishuai [1 ]
Bu, Jiajun [1 ]
Yu, Zhi [1 ,2 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Zheda Rd, Hangzhou 310027, Zhejiang, Peoples R China
[2] Zhejiang Univ, Zhejiang Prov Key Lab Serv Robot, Zheda Rd, Hangzhou 310027, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Data-free knowledge distillation; Unsupervised representation learning; Knowledge distillation;
DOI
10.1016/j.neunet.2024.106386
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
In scenarios such as privacy protection or large-scale data transmission, data-free knowledge distillation (DFKD) methods perform knowledge distillation (KD) when the original training data are not accessible. They generate pseudo samples by extracting knowledge from the teacher model and use these samples for KD. The limitation of previous DFKD methods is that their target distributions are static and focus on instance-level distributions, which makes them heavily reliant on the pretrained teacher model. To address these concerns, this study introduces a novel DFKD approach, AdaDFKD, which establishes and exploits relationships among pseudo samples that adapt to the student model, thereby mitigating the aforementioned risk. We achieve this by generating samples that progress from "easy-to-discriminate" to "hard-to-discriminate", as humans learn. We design a relationship refinement module (R2M) to optimize the generation process, in which we learn a progressive conditional distribution of negative samples and maximize the log-likelihood of inter-sample similarity among pseudo samples. Theoretically, we show that this design both minimizes the divergence and maximizes the mutual information between the distributions of the teacher and student models. Experimental results demonstrate the superiority of our approach over state-of-the-art (SOTA) DFKD methods across various benchmarks, teacher-student pairs, and evaluation metrics, as well as its robustness and fast convergence.
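The abstract's description of R2M (a progressive conditional distribution over negative samples and a maximized log-likelihood of inter-sample similarity) can be illustrated with a small sketch. The code below is not the authors' implementation; it assumes an InfoNCE-style objective over normalized teacher and student features of one pseudo-sample batch, with a hypothetical hardness parameter that up-weights hard-to-discriminate negatives as training progresses. The function name r2m_loss and all parameter names are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F


def r2m_loss(student_feats, teacher_feats, tau=0.1, hardness=1.0):
    """Illustrative inter-sample relational loss over a batch of pseudo samples.

    student_feats, teacher_feats: (B, D) features of the same B pseudo samples.
    tau: temperature of the similarity softmax.
    hardness: > 0; increasing it over training emphasizes hard-to-discriminate
        negative pairs, mimicking an easy-to-hard progression (assumed schedule).
    """
    s = F.normalize(student_feats, dim=1)
    t = F.normalize(teacher_feats, dim=1)

    logits = s @ t.T / tau                          # (B, B) cross-model similarities
    B = logits.size(0)
    labels = torch.arange(B, device=logits.device)  # positives on the diagonal
    neg_mask = ~torch.eye(B, dtype=torch.bool, device=logits.device)

    # Assumed progressive negative weighting: more similar (harder) negatives get
    # larger weights; adding log-weights to the logits applies this inside the
    # softmax without touching the positive term. With hardness -> 0 the weights
    # become uniform and the loss reduces to plain InfoNCE.
    with torch.no_grad():
        neg_weights = torch.softmax(
            hardness * logits.masked_fill(~neg_mask, float("-inf")), dim=1
        ).clamp_min(1e-8)
    weighted_logits = logits + neg_mask.float() * torch.log(neg_weights * (B - 1))

    # Maximizing the log-likelihood of the matching pair equals minimizing
    # cross-entropy over the weighted inter-sample similarity distribution.
    return F.cross_entropy(weighted_logits, labels)


if __name__ == "__main__":
    # Toy usage: random features stand in for teacher/student embeddings
    # of generated pseudo samples.
    s_feat = torch.randn(8, 128)
    t_feat = torch.randn(8, 128)
    print(r2m_loss(s_feat, t_feat, tau=0.1, hardness=2.0).item())
```

In a full DFKD loop (also an assumption here), such a relational term would be added to the generator and student objectives so that pseudo samples are shaped by relationships the student can discriminate, rather than by the teacher's instance-level responses alone.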
Pages: 15