AdaDFKD: Exploring adaptive inter-sample relationship in data-free knowledge distillation

Cited by: 0
Authors:
Li, Jingru [1 ]
Zhou, Sheng [1 ]
Li, Liangcheng [1 ]
Wang, Haishuai [1 ]
Bu, Jiajun [1 ]
Yu, Zhi [1 ,2 ]
Affiliations:
[1] Zhejiang Univ, Coll Comp Sci & Technol, Zheda Rd, Hangzhou 310027, Zhejiang, Peoples R China
[2] Zhejiang Univ, Zhejiang Prov Key Lab Serv Robot, Zheda Rd, Hangzhou 310027, Zhejiang, Peoples R China
Funding:
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords:
Data-free knowledge distillation; Unsupervised representation learning; Knowledge distillation;
DOI
10.1016/j.neunet.2024.106386
Chinese Library Classification:
TP18 [Artificial Intelligence Theory];
Discipline codes:
081104; 0812; 0835; 1405;
Abstract
In scenarios such as privacy protection or large-scale data transmission, data-free knowledge distillation (DFKD) methods have been proposed to perform Knowledge Distillation (KD) when the original training data are not accessible. They generate pseudo samples by extracting knowledge from the teacher model and then use these pseudo samples for KD. The challenge of previous DFKD methods lies in the static nature of their target distributions: they focus on learning instance-level distributions, which makes them overly reliant on the pretrained teacher model. To address these concerns, our study introduces a novel DFKD approach, AdaDFKD, designed to establish and exploit relationships among pseudo samples that adapt to the student model, thereby effectively mitigating the aforementioned risk. We achieve this by progressing from "easy-to-discriminate" samples to "hard-to-discriminate" samples, as humans do. We design a relationship refinement module (R2M) to optimize the generation process, in which we learn a progressive conditional distribution of negative samples and maximize the log-likelihood of inter-sample similarity of pseudo samples. Theoretically, we show that this design of AdaDFKD both minimizes the divergence and maximizes the mutual information between the distributions of the teacher and student models. Experimental results demonstrate the superiority of our approach over state-of-the-art (SOTA) DFKD methods across various benchmarks, teacher-student pairs, and evaluation metrics, as well as its robustness and fast convergence.
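The abstract only describes R2M at a high level (maximizing the log-likelihood of inter-sample similarity between teacher and student views of pseudo samples). The sketch below illustrates one plausible reading of such an objective as an InfoNCE-style contrastive loss over a batch of pseudo samples; the function name, temperature, and feature-extraction points are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch (assumed, not the authors' exact R2M loss): each pseudo
# sample's student embedding should be similar to its own teacher embedding
# and dissimilar to the teacher embeddings of the other pseudo samples in the
# batch, i.e. maximize the log-likelihood of the correct inter-sample match.
import torch
import torch.nn.functional as F

def r2m_similarity_loss(student_feats, teacher_feats, temperature=0.1):
    """InfoNCE-style inter-sample similarity loss over a pseudo-sample batch."""
    s = F.normalize(student_feats, dim=1)      # (B, D) student embeddings
    t = F.normalize(teacher_feats, dim=1)      # (B, D) teacher embeddings
    logits = s @ t.t() / temperature           # (B, B) pairwise similarities
    targets = torch.arange(s.size(0), device=s.device)  # diagonal = positives
    # Negative log-likelihood of matching each sample with its own teacher view
    return F.cross_entropy(logits, targets)

# Usage sketch: pseudo samples x come from the generator; features are taken
# from the penultimate layers of the (frozen) teacher and the student.
# loss = r2m_similarity_loss(student_backbone(x), teacher_backbone(x).detach())
```

Under this reading, the "progressive conditional distribution of negative samples" described in the abstract would correspond to reweighting or scheduling the off-diagonal (negative) terms from easy to hard as training proceeds.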
Pages: 15
Related papers
50 records in total
  • [1] Conditional generative data-free knowledge distillation
    Yu, Xinyi
    Yan, Ling
    Yang, Yang
    Zhou, Libo
    Ou, Linlin
    IMAGE AND VISION COMPUTING, 2023, 131
  • [2] Data-free Knowledge Distillation for Object Detection
    Chawla, Akshay
    Yin, Hongxu
    Molchanov, Pavlo
    Alvarez, Jose
    2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021, 2021, : 3288 - 3297
  • [3] A Data-Free Distillation Framework for Adaptive Bitrate Algorithms
    Huang T.-C.
    Li C.-Y.
    Zhang R.-X.
    Li W.-Z.
    Sun L.-F.
    Jisuanji Xuebao/Chinese Journal of Computers, 2024, 47 (01): : 113 - 130
  • [4] Impartial Adversarial Distillation: Addressing Biased Data-Free Knowledge Distillation via Adaptive Constrained Optimization
    Liao, Donping
    Gao, Xitong
    Xu, Chengzhong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 4, 2024, : 3342 - 3350
  • [5] Data-Free Network Quantization With Adversarial Knowledge Distillation
    Choi, Yoojin
    Choi, Jihwan
    El-Khamy, Mostafa
    Lee, Jungwon
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 3047 - 3057
  • [6] ROBUSTNESS AND DIVERSITY SEEKING DATA-FREE KNOWLEDGE DISTILLATION
    Han, Pengchao
    Park, Jihong
    Wang, Shiqiang
    Liu, Yejun
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 2740 - 2744
  • [7] Data-free knowledge distillation in neural networks for regression
    Kang, Myeonginn
    Kang, Seokho
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 175
  • [8] Empirical Study of Data-Free Iterative Knowledge Distillation
    Shah, Het
    Vaswani, Ashwin
    Dash, Tirtharaj
    Hebbalaguppe, Ramya
    Srinivasan, Ashwin
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2021, PT III, 2021, 12893 : 546 - 557
  • [9] Data-free Knowledge Distillation for Reusing Recommendation Models
    Wang, Cheng
    Sun, Jiacheng
    Dong, Zhenhua
    Zhu, Jieming
    Li, Zhenguo
    Li, Ruixuan
    Zhang, Rui
    PROCEEDINGS OF THE 17TH ACM CONFERENCE ON RECOMMENDER SYSTEMS, RECSYS 2023, 2023, : 386 - 395
  • [10] Variational Data-Free Knowledge Distillation for Continual Learning
    Li, Xiaorong
    Wang, Shipeng
    Sun, Jian
    Xu, Zongben
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (10) : 12618 - 12634