PQK: Model Compression via Pruning, Quantization, and Knowledge Distillation

Cited by: 9
Authors
Kim, Jangho [1 ,2 ]
Chang, Simyung [1 ]
Kwak, Nojun [2 ]
Affiliations
[1] Qualcomm Korea YH, Qualcomm AI Res, Seoul, South Korea
[2] Seoul Natl Univ, Seoul, South Korea
Source
INTERSPEECH 2021
Funding
National Research Foundation, Singapore
Keywords
keyword spotting; model pruning; model quantization; knowledge distillation;
DOI
10.21437/Interspeech.2021-248
Chinese Library Classification (CLC)
R36 [Pathology]; R76 [Otorhinolaryngology]
Subject classification codes
100104; 100213
Abstract
As edge devices become prevalent, deploying deep neural networks (DNNs) on them has become a critical issue. However, DNNs require high computational resources that are rarely available on edge devices. To handle this, we propose a novel model compression method for devices with limited computational resources, called PQK, which consists of pruning, quantization, and knowledge distillation (KD) processes. Unlike traditional pruning and KD, PQK makes use of the unimportant weights pruned in the pruning process to build a teacher network for training a better student network, without pre-training the teacher model. PQK has two phases. Phase 1 exploits iterative pruning and quantization-aware training to make a lightweight and power-efficient model. In phase 2, we make a teacher network by adding the unimportant weights unused in phase 1 back to the pruned network. Using this teacher network, we train the pruned network as the student network. In doing so, we do not need a pre-trained teacher network for the KD framework, because the teacher and the student networks coexist within the same network (see Fig. 1). We apply our method to recognition models and verify the effectiveness of PQK on keyword spotting (KWS) and image recognition.
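The abstract above outlines a two-phase procedure. The following is a minimal, hypothetical PyTorch sketch of that idea, not the authors' implementation: phase 1 is reduced to magnitude pruning (quantization-aware training is omitted), and phase 2 treats the same network with its pruned weights restored as the teacher and the masked network as the student. The `PrunableLinear` layer, the 0.7 sparsity level, the temperature `T`, and the loss weight `alpha` are illustrative assumptions.

```python
# Hypothetical sketch of the PQK idea (not the authors' code). Phase 1:
# magnitude pruning (QAT omitted). Phase 2: the teacher is the same network
# with the pruned "unimportant" weights restored; it distils into the
# pruned student, so no separate pre-trained teacher is needed.
import torch
import torch.nn as nn
import torch.nn.functional as F


def magnitude_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Binary mask keeping the largest-magnitude (1 - sparsity) fraction of weights."""
    k = int(weight.numel() * sparsity)
    if k == 0:
        return torch.ones_like(weight)
    threshold = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > threshold).float()


class PrunableLinear(nn.Linear):
    """Linear layer that can run with either the pruned (student) or full (teacher) weights."""

    def __init__(self, in_features, out_features):
        super().__init__(in_features, out_features)
        self.register_buffer("mask", torch.ones_like(self.weight))
        self.use_full_weights = False  # True -> teacher view, False -> student view

    def forward(self, x):
        w = self.weight if self.use_full_weights else self.weight * self.mask
        return F.linear(x, w, self.bias)


def prune(model: nn.Module, sparsity: float):
    # Phase 1 (simplified): recompute the magnitude masks; in the paper this
    # is iterated and combined with quantization-aware training.
    for m in model.modules():
        if isinstance(m, PrunableLinear):
            m.mask.copy_(magnitude_mask(m.weight.detach(), sparsity))


def set_teacher_mode(model: nn.Module, teacher: bool):
    # Phase 2: toggling this adds the unimportant weights back in, turning
    # the pruned student into its own teacher within the same network.
    for m in model.modules():
        if isinstance(m, PrunableLinear):
            m.use_full_weights = teacher


def kd_step(model, x, y, optimizer, T=4.0, alpha=0.5):
    """One distillation step: the teacher view guides the pruned student view."""
    with torch.no_grad():
        set_teacher_mode(model, True)
        teacher_logits = model(x)
    set_teacher_mode(model, False)
    student_logits = model(x)
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    loss = alpha * kd + (1 - alpha) * F.cross_entropy(student_logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    # Toy KWS-like setup: 40-dim features, 12 keyword classes (illustrative only).
    model = nn.Sequential(PrunableLinear(40, 64), nn.ReLU(), PrunableLinear(64, 12))
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    x, y = torch.randn(8, 40), torch.randint(0, 12, (8,))
    prune(model, sparsity=0.7)        # phase 1 (iterated in practice)
    print(kd_step(model, x, y, opt))  # one phase-2 distillation step
```

Because the masked-out weights receive no gradient through the student view, they remain available for the teacher view; the paper additionally trains the teacher and applies quantization-aware training in phase 1, which this sketch leaves out.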
Pages: 4568-4572
Number of pages: 5
Related papers
50 records in total
  • [1] Compression of Acoustic Model via Knowledge Distillation and Pruning
    Li, Chenxing
    Zhu, Lei
    Xu, Shuang
    Gao, Peng
    Xu, Bo
    2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 2785 - 2790
  • [2] Model compression via pruning and knowledge distillation for person re-identification
    Xie, Haonan
    Jiang, Wei
    Luo, Hao
    Yu, Hongyan
    JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2021, 12 (02) : 2149 - 2161
  • [3] Quantization Robust Pruning With Knowledge Distillation
    Kim, Jangho
    IEEE ACCESS, 2023, 11 : 26419 - 26426
  • [4] Private Model Compression via Knowledge Distillation
    Wang, Ji
    Bao, Weidong
    Sun, Lichao
    Zhu, Xiaomin
    Cao, Bokai
    Yu, Philip S.
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 1190 - +
  • [5] Efficient and Controllable Model Compression through Sequential Knowledge Distillation and Pruning
    Malihi, Leila
    Heidemann, Gunther
    BIG DATA AND COGNITIVE COMPUTING, 2023, 7 (03)
  • [6] Iterative knowledge distillation and pruning for model compression in unsupervised domain adaptation
    Wang, Zhiyuan
    Shi, Long
    Mei, Zhen
    Zhao, Xiang
    Wang, Zhe
    Li, Jun
    PATTERN RECOGNITION, 2025, 164
  • [7] Model Compression by Iterative Pruning with Knowledge Distillation and Its Application to Speech Enhancement
    Wei, Zeyuan
    Li, Hao
    Zhang, Xueliang
    INTERSPEECH 2022, 2022, : 941 - 945
  • [8] Joint structured pruning and dense knowledge distillation for efficient transformer model compression
    Cui, Baiyun
    Li, Yingming
    Zhang, Zhongfei
    NEUROCOMPUTING, 2021, 458 : 56 - 69
  • [9] Combining Weight Pruning and Knowledge Distillation For CNN Compression
    Aghli, Nima
    Ribeiro, Eraldo
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 3185 - 3192