Dual discriminator adversarial distillation for data-free model compression

Times Cited: 12
Authors
Zhao, Haoran [1 ]
Sun, Xin [1 ,2 ]
Dong, Junyu [1 ]
Manic, Milos [3 ]
Zhou, Huiyu [4 ]
Yu, Hui [5 ]
Affiliations
[1] Ocean Univ China, Coll Informat Sci & Engn, Qingdao, Peoples R China
[2] Tech Univ Munich, Dept Aerosp & Geodesy, Munich, Germany
[3] Virginia Commonwealth Univ, Coll Engn, Richmond, VA USA
[4] Univ Leicester, Sch Informat, Leicester, Leics, England
[5] Univ Portsmouth, Sch Creat Technol, Portsmouth, Hants, England
Funding
National Natural Science Foundation of China;
Keywords
Deep neural networks; Image classification; Model compression; Knowledge distillation; Data-free; KNOWLEDGE; NETWORK; RECOGNITION;
DOI
10.1007/s13042-021-01443-0
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Knowledge distillation has been widely used to produce portable and efficient neural networks that can be deployed on edge devices for computer vision tasks. However, almost all top-performing knowledge distillation methods need access to the original training data, which is usually huge in size and often unavailable. To tackle this problem, we propose a novel data-free approach in this paper, named Dual Discriminator Adversarial Distillation (DDAD), which distills a neural network without the need for any training data or meta-data. Specifically, we use a generator to create samples that mimic the original training data through dual discriminator adversarial distillation. The generator not only exploits the pre-trained teacher's intrinsic statistics stored in its batch normalization layers but also seeks the maximum discrepancy from the student model. The generated samples are then used to train the compact student network under the supervision of the teacher. The proposed method obtains an efficient student network that closely approximates its teacher network, without using the original training data. Extensive experiments demonstrate the effectiveness of the proposed approach on the CIFAR, Caltech101 and ImageNet datasets for classification tasks. Moreover, we extend our method to semantic segmentation tasks on several public datasets such as CamVid, NYUv2, Cityscapes and VOC 2012. To the best of our knowledge, this is the first work on generative-model-based data-free knowledge distillation on large-scale datasets such as ImageNet, Cityscapes and VOC 2012. Experiments show that our method outperforms all baselines for data-free knowledge distillation.
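The abstract outlines an alternating, two-player loop: a generator synthesizes surrogate images by matching the teacher's batch-normalization statistics while maximizing the teacher/student output discrepancy, and the student is then distilled on those images. Below is a minimal PyTorch sketch of that loop, assuming CIFAR-like 32x32 RGB inputs, an L1 discrepancy loss on logits, and an illustrative 0.1 weight on the BN-statistics term; the class names, helper functions and hyper-parameters are assumptions for illustration, not the authors' released implementation.

```python
# Minimal sketch of dual-discriminator, data-free adversarial distillation.
# Assumptions (not from the paper's code): 32x32 RGB inputs, L1 discrepancy
# loss on logits, and a 0.1 weight on the BN-statistics term.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Generator(nn.Module):
    """Maps a noise vector to a synthetic image of the teacher's input size."""
    def __init__(self, nz=256, img_size=32, channels=3):
        super().__init__()
        self.init_size = img_size // 4
        self.fc = nn.Linear(nz, 128 * self.init_size ** 2)
        self.net = nn.Sequential(
            nn.BatchNorm2d(128),
            nn.Upsample(scale_factor=2),
            nn.Conv2d(128, 64, 3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2),
            nn.Conv2d(64, channels, 3, padding=1),
            nn.Tanh(),
        )

    def forward(self, z):
        x = self.fc(z).view(z.size(0), 128, self.init_size, self.init_size)
        return self.net(x)

def bn_statistics_loss(teacher, images):
    """Penalize the gap between the batch statistics of the generated images'
    activations and the running statistics stored in the teacher's BN layers."""
    losses, hooks = [], []

    def make_hook(bn):
        def hook(module, inputs, output):
            x = inputs[0]
            mean = x.mean(dim=[0, 2, 3])
            var = x.var(dim=[0, 2, 3], unbiased=False)
            losses.append(F.mse_loss(mean, bn.running_mean) +
                          F.mse_loss(var, bn.running_var))
        return hook

    for m in teacher.modules():
        if isinstance(m, nn.BatchNorm2d):
            hooks.append(m.register_forward_hook(make_hook(m)))
    teacher(images)                      # hooks collect per-layer losses
    for h in hooks:
        h.remove()
    return sum(losses)

def train_step(generator, teacher, student, opt_g, opt_s,
               nz=256, batch_size=128, device="cpu"):
    """One alternating update: generator first, then student.
    The teacher is assumed frozen (eval mode, requires_grad=False) throughout."""
    teacher.eval()

    # Generator step: stay close to the teacher's BN statistics while pushing
    # the teacher and student outputs apart on the synthesized samples.
    z = torch.randn(batch_size, nz, device=device)
    fake = generator(z)
    t_logits, s_logits = teacher(fake), student(fake)
    loss_bn = bn_statistics_loss(teacher, fake)
    loss_adv = -F.l1_loss(s_logits, t_logits)      # maximize the discrepancy
    loss_g = loss_adv + 0.1 * loss_bn              # 0.1 is an assumed weight
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()

    # Student step: imitate the frozen teacher on freshly generated samples.
    with torch.no_grad():
        fake = generator(torch.randn(batch_size, nz, device=device))
        t_logits = teacher(fake)
    s_logits = student(fake)
    loss_s = F.l1_loss(s_logits, t_logits)
    opt_s.zero_grad()
    loss_s.backward()
    opt_s.step()
    return loss_g.item(), loss_s.item()
```

In this reading, the teacher's stored BN statistics act as one fixed, discriminator-like signal and the student as the second, learned one; alternating the two updates plays out the adversarial game the abstract describes.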
Pages: 1213 - 1230
Number of pages: 18
Related Papers
50 records in total
  • [1] Dual discriminator adversarial distillation for data-free model compression
    Zhao, Haoran
    Sun, Xin
    Dong, Junyu
    Manic, Milos
    Zhou, Huiyu
    Yu, Hui
    International Journal of Machine Learning and Cybernetics, 2022, 13 : 1213 - 1230
  • [2] Dual-discriminator adversarial framework for data-free quantization
    Li, Zhikai
    Ma, Liping
    Long, Xianlei
    Xiao, Junrui
    Gu, Qingyi
    NEUROCOMPUTING, 2022, 511 : 67 - 77
  • [3] Data-Free Network Quantization With Adversarial Knowledge Distillation
    Choi, Yoojin
    Choi, Jihwan
    El-Khamy, Mostafa
    Lee, Jungwon
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 3047 - 3057
  • [4] Momentum Adversarial Distillation: Handling Large Distribution Shifts in Data-Free Knowledge Distillation
    Do, Kien
    Le, Hung
    Nguyen, Dung
    Nguyen, Dang
    Harikumar, Haripriya
    Tran, Truyen
    Rana, Santu
    Venkatesh, Svetha
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [5] ENHANCING DATA-FREE ADVERSARIAL DISTILLATION WITH ACTIVATION REGULARIZATION AND VIRTUAL INTERPOLATION
    Qu, Xiaoyang
    Wang, Jianzong
    Xiao, Jing
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 3340 - 3344
  • [6] Adversarial Self-Supervised Data-Free Distillation for Text Classification
    Ma, Xinyin
    Shen, Yongliang
    Fang, Gongfan
    Chen, Chen
    Jia, Chenghao
    Lu, Weiming
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 6182 - 6192
  • [7] DATA-FREE WATERMARK FOR DEEP NEURAL NETWORKS BY TRUNCATED ADVERSARIAL DISTILLATION
    Yan, Chao-Bo
    Li, Fang-Qi
    Wang, Shi-Lin
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 4480 - 4484
  • [8] Out of Thin Air: Exploring Data-Free Adversarial Robustness Distillation
    Wang, Yuzheng
    Chen, Zhaoyu
    Yang, Dingkang
    Guo, Pinxue
    Jiang, Kaixun
    Zhang, Wenqiang
    Qi, Lizhe
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 6, 2024, : 5776 - 5784
  • [9] Data-Free Ensemble Knowledge Distillation for Privacy-conscious Multimedia Model Compression
    Hao, Zhiwei
    Luo, Yong
    Hu, Han
    An, Jianping
    Wen, Yonggang
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 1803 - 1811
  • [10] Data-Free Network Pruning for Model Compression
    Tang, Jialiang
    Liu, Mingjin
    Jiang, Ning
    Cai, Huan
    Yu, Wenxin
    Zhou, Jinjia
    2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2021,