De-confounded Data-free Knowledge Distillation for Handling Distribution Shifts

Cited by: 2
Authors
Wang, Yuzheng [1 ]
Yang, Dingkang [1 ]
Chen, Zhaoyu [1 ]
Liu, Yang [1 ]
Liu, Siao [1 ]
Zhang, Wenqiang [2 ]
Zhang, Lihua [1 ]
Qi, Lizhe [1 ,2 ,3 ]
Affiliations
[1] Fudan Univ, Acad Engn & Technol, Shanghai Engn Res Ctr AI & Robot, Shanghai, Peoples R China
[2] Fudan Univ, Acad Engn & Technol, Engn Res Ctr AI & Robot, Minist Educ, Shanghai, Peoples R China
[3] Green Ecol Smart Technol Sch Enterprise Joint Res, Shanghai, Peoples R China
Keywords
CAUSAL INFERENCE;
DOI
10.1109/CVPR52733.2024.01199
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data-Free Knowledge Distillation (DFKD) is a promising task that trains high-performance small models for practical deployment without relying on the original training data. Existing methods commonly avoid private data by using synthetic or sampled data instead. However, a long-overlooked issue is the severe distribution shift between this substitute data and the original data, which manifests as large differences in image quality and class proportions. These harmful shifts essentially act as a confounder that causes significant performance bottlenecks. To tackle the issue, this paper proposes a novel causal-inference perspective that disentangles the student models from the impact of such shifts. By designing a customized causal graph, we first reveal the causalities among the variables in the DFKD task. We then propose a Knowledge Distillation Causal Intervention (KDCI) framework based on the backdoor adjustment to de-confound the confounder. KDCI can be flexibly combined with most existing state-of-the-art baselines. Experiments combining KDCI with six representative DFKD methods demonstrate its effectiveness: it improves existing methods under almost all settings, e.g., raising baseline accuracy by up to 15.54% on the CIFAR-100 dataset.
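For context on the backdoor adjustment mentioned in the abstract, the standard causal-inference identity stratifies over the confounder Z (here, the distribution shift between the substitute and original data) instead of conditioning on the observed, biased distribution. This is the generic formula, not necessarily KDCI's exact instantiation:

P(Y \mid \mathrm{do}(X)) = \sum_{z} P(Y \mid X, Z = z)\, P(Z = z)

Averaging the student's prediction Y over the strata of Z removes the confounder's influence on the learned mapping, which is consistent with the abstract's claim that KDCI can be attached to existing DFKD baselines as a de-confounding step.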
Pages: 12615-12625
Page count: 11
Related Papers
50 records in total
  • [31] Discovering and Overcoming Limitations of Noise-engineered Data-free Knowledge Distillation
    Raikwar, Piyush
    Mishra, Deepak
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [32] Dynamic data-free knowledge distillation by easy-to-hard learning strategy
    Li, Jingru
    Zhou, Sheng
    Li, Liangcheng
    Wang, Haishuai
    Bu, Jiajun
    Yu, Zhi
    INFORMATION SCIENCES, 2023, 642
  • [33] Impartial Adversarial Distillation: Addressing Biased Data-Free Knowledge Distillation via Adaptive Constrained Optimization
    Liao, Dongping
    Gao, Xitong
    Xu, Chengzhong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 4, 2024, : 3342 - 3350
  • [34] AdaDFKD: Exploring adaptive inter-sample relationship in data-free knowledge distillation
    Li, Jingru
    Zhou, Sheng
    Li, Liangcheng
    Wang, Haishuai
    Bu, Jiajun
    Yu, Zhi
    NEURAL NETWORKS, 2024, 177
  • [35] FedTAD: Topology-aware Data-free Knowledge Distillation for Subgraph Federated Learning
    Zhu, Yinlin
    Li, Xunkai
    Wu, Zhengyu
    Wu, Di
    Hu, Miao
    Li, Rong-Hua
    PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024, : 5716 - 5724
  • [36] Data-Free Ensemble Knowledge Distillation for Privacy-conscious Multimedia Model Compression
    Hao, Zhiwei
    Luo, Yong
    Hu, Han
    An, Jianping
    Wen, Yonggang
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 1803 - 1811
  • [37] Robust and Resource-Efficient Data-Free Knowledge Distillation by Generative Pseudo Replay
    Binici, Kuluhan
    Aggarwal, Shivam
    Pham, Nam Trung
    Leman, Karianto
    Mitra, Tulika
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 6089 - 6096
  • [38] Reusable generator data-free knowledge distillation with hard loss simulation for image classification
    Sun, Yafeng
    Wang, Xingwang
    Huang, Junhong
    Chen, Shilin
    Hou, Minghui
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 265
  • [39] A Data-Free Distillation Framework for Adaptive Bitrate Algorithms
    Huang T.-C.
    Li C.-Y.
    Zhang R.-X.
    Li W.-Z.
    Sun L.-F.
    Jisuanji Xuebao/Chinese Journal of Computers, 2024, 47 (01): : 113 - 130
  • [40] Data-free knowledge distillation via generator-free data generation for Non-IID federated learning
    Zhao, Siran
    Liao, Tianchi
    Fu, Lele
    Chen, Chuan
    Bian, Jing
    Zheng, Zibin
    NEURAL NETWORKS, 2024, 179