AdaDFKD: Exploring adaptive inter-sample relationship in data-free knowledge distillation

Cited by: 0
Authors
Li, Jingru [1 ]
Zhou, Sheng [1 ]
Li, Liangcheng [1 ]
Wang, Haishuai [1 ]
Bu, Jiajun [1 ]
Yu, Zhi [1 ,2 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Zheda Rd, Hangzhou 310027, Zhejiang, Peoples R China
[2] Zhejiang Univ, Zhejiang Prov Key Lab Serv Robot, Zheda Rd, Hangzhou 310027, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Data-free knowledge distillation; Unsupervised representation learning; Knowledge distillation;
DOI
10.1016/j.neunet.2024.106386
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
In scenarios such as privacy protection or large-scale data transmission, data-free knowledge distillation (DFKD) methods perform knowledge distillation (KD) when the original training data are not accessible. They generate pseudo samples by extracting knowledge from the teacher model and use these samples for KD. The limitation of previous DFKD methods is that their target distributions are static and focus on instance-level distributions, which makes them heavily reliant on the pretrained teacher model. To address these concerns, this study introduces a novel DFKD approach, AdaDFKD, which establishes and exploits relationships among pseudo samples that adapt to the student model, thereby mitigating the aforementioned risk. We achieve this by generating samples that progress from "easy-to-discriminate" to "hard-to-discriminate", as humans learn. We design a relationship refinement module (R2M) to optimize the generation process, in which we learn a progressive conditional distribution of negative samples and maximize the log-likelihood of inter-sample similarity among pseudo samples. Theoretically, we show that this design both minimizes the divergence and maximizes the mutual information between the distributions of the teacher and student models. Experimental results demonstrate the superiority of our approach over state-of-the-art (SOTA) DFKD methods across various benchmarks, teacher-student pairs, and evaluation metrics, as well as its robustness and fast convergence.
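The abstract's description of R2M (a progressive conditional distribution over negative samples and a maximized log-likelihood of inter-sample similarity) can be illustrated with a small sketch. The code below is not the authors' implementation; it assumes an InfoNCE-style objective over normalized teacher and student features of one pseudo-sample batch, with a hypothetical hardness parameter that up-weights hard-to-discriminate negatives as training progresses. The function name r2m_loss and all parameter names are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F


def r2m_loss(student_feats, teacher_feats, tau=0.1, hardness=1.0):
    """Illustrative inter-sample relational loss over a batch of pseudo samples.

    student_feats, teacher_feats: (B, D) features of the same B pseudo samples.
    tau: temperature of the similarity softmax.
    hardness: > 0; increasing it over training emphasizes hard-to-discriminate
        negative pairs, mimicking an easy-to-hard progression (assumed schedule).
    """
    s = F.normalize(student_feats, dim=1)
    t = F.normalize(teacher_feats, dim=1)

    logits = s @ t.T / tau                          # (B, B) cross-model similarities
    B = logits.size(0)
    labels = torch.arange(B, device=logits.device)  # positives on the diagonal
    neg_mask = ~torch.eye(B, dtype=torch.bool, device=logits.device)

    # Assumed progressive negative weighting: more similar (harder) negatives get
    # larger weights; adding log-weights to the logits applies this inside the
    # softmax without touching the positive term. With hardness -> 0 the weights
    # become uniform and the loss reduces to plain InfoNCE.
    with torch.no_grad():
        neg_weights = torch.softmax(
            hardness * logits.masked_fill(~neg_mask, float("-inf")), dim=1
        ).clamp_min(1e-8)
    weighted_logits = logits + neg_mask.float() * torch.log(neg_weights * (B - 1))

    # Maximizing the log-likelihood of the matching pair equals minimizing
    # cross-entropy over the weighted inter-sample similarity distribution.
    return F.cross_entropy(weighted_logits, labels)


if __name__ == "__main__":
    # Toy usage: random features stand in for teacher/student embeddings
    # of generated pseudo samples.
    s_feat = torch.randn(8, 128)
    t_feat = torch.randn(8, 128)
    print(r2m_loss(s_feat, t_feat, tau=0.1, hardness=2.0).item())
```

In a full DFKD loop (also an assumption here), such a relational term would be added to the generator and student objectives so that pseudo samples are shaped by relationships the student can discriminate, rather than by the teacher's instance-level responses alone.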
Pages: 15