PQK: Model Compression via Pruning, Quantization, and Knowledge Distillation

Cited by: 9
Authors
Kim, Jangho [1,2]
Chang, Simyung [1]
Kwak, Nojun [2]
Affiliations
[1] Qualcomm Korea YH, Qualcomm AI Res, Seoul, South Korea
[2] Seoul Natl Univ, Seoul, South Korea
Source
INTERSPEECH 2021
Funding
National Research Foundation of Singapore;
Keywords
keyword spotting; model pruning; model quantization; knowledge distillation;
DOI
10.21437/Interspeech.2021-248
Chinese Library Classification (CLC)
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline Classification Code
100104; 100213;
Abstract
As edge devices become prevalent, deploying Deep Neural Networks (DNNs) on edge devices has become a critical issue. However, DNNs require high computational resources, which are rarely available on edge devices. To handle this, we propose a novel model compression method for devices with limited computational resources, called PQK, consisting of pruning, quantization, and knowledge distillation (KD) processes. Unlike traditional pruning and KD, PQK makes use of the unimportant weights pruned in the pruning process to build a teacher network for training a better student network, without pre-training the teacher model. PQK has two phases. Phase 1 exploits iterative pruning and quantization-aware training to make a lightweight and power-efficient model. In phase 2, we make a teacher network by adding the unimportant weights unused in phase 1 back to the pruned network. Using this teacher network, we train the pruned network as a student network. In doing so, we do not need a pre-trained teacher network for the KD framework, because the teacher and the student networks coexist within the same network (see Fig. 1). We apply our method to recognition models and verify the effectiveness of PQK on keyword spotting (KWS) and image recognition.
Pages: 4568-4572
Page count: 5
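
The phase-2 idea described in the abstract can be illustrated with a minimal, hedged sketch. This is not the authors' implementation: the magnitude-based pruning criterion, the per-tensor handling, and the distillation temperature/weighting below are illustrative assumptions. The sketch only shows how the kept (important) weights can serve as the student while the same tensor with the pruned-away "unimportant" weights added back serves as the teacher, combined with a standard soft-target distillation loss.

# Hedged sketch of the PQK phase-2 setup (not the authors' code).
# The pruned sub-network acts as the student; adding the pruned-away
# ("unimportant") weights back yields the teacher, so no separately
# pre-trained teacher is needed. Pruning criterion and KD hyperparameters
# are illustrative assumptions.
import torch
import torch.nn.functional as F

def magnitude_mask(weight: torch.Tensor, keep_ratio: float = 0.5) -> torch.Tensor:
    """Binary mask keeping the largest-magnitude fraction of weights (assumed criterion)."""
    k = max(1, int(weight.numel() * keep_ratio))
    threshold = weight.abs().flatten().topk(k).values.min()
    return (weight.abs() >= threshold).float()

def student_and_teacher_weights(weight: torch.Tensor, mask: torch.Tensor):
    """Student uses only the kept (important) weights; the teacher reuses the
    kept weights plus the pruned-away remainder, i.e. the full tensor."""
    student_w = weight * mask
    teacher_w = weight
    return student_w, teacher_w

def kd_loss(student_logits, teacher_logits, targets, T: float = 4.0, alpha: float = 0.9):
    """Standard soft-target distillation loss; T and alpha are illustrative values."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1.0 - alpha) * hard

Because the teacher and the student share the same parameter tensor and differ only by the pruning mask, no separately pre-trained teacher is required, which is the point the abstract makes about phase 2; phase 1's iterative pruning and quantization-aware training are not shown here.
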
Related Papers (50 in total)
  • [21] Triplet Knowledge Distillation Networks for Model Compression
    Tang, Jialiang
    Jiang, Ning
    Yu, Wenxin
    Wu, Wenqin
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [22] Analysis of Model Compression Using Knowledge Distillation
    Hong, Yu-Wei
    Leu, Jenq-Shiou
    Faisal, Muhamad
    Prakosa, Setya Widyawan
    IEEE ACCESS, 2022, 10 : 85095 - 85105
  • [23] EPSD: Early Pruning with Self-Distillation for Efficient Model Compression
    Chen, Dong
    Liu, Ning
    Zhu, Yichen
    Che, Zhengping
    Ma, Rui
    Zhang, Fachao
    Mou, Xiaofeng
    Chang, Yi
    Tang, Jian
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 10, 2024, : 11258 - 11266
  • [24] Iterative Transfer Knowledge Distillation and Channel Pruning for Unsupervised Cross-Domain Compression
    Wang, Zhiyuan
    Shi, Long
    Mei, Zhen
    Zhao, Xiang
    Wang, Zhe
    Li, Jun
    WEB INFORMATION SYSTEMS AND APPLICATIONS, WISA 2024, 2024, 14883 : 3 - 15
  • [25] STRUCTURED PRUNING AND QUANTIZATION FOR LEARNED IMAGE COMPRESSION
    Hossain, Md Adnan Faisal
    Zhu, Fengqing
    2024 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2024, : 3730 - 3736
  • [26] Compressed MoE ASR Model Based on Knowledge Distillation and Quantization
    Yuan, Yuping
    You, Zhao
    Feng, Shulin
    Su, Dan
    Liang, Yanchun
    Shi, Xiaohu
    Yu, Dong
    INTERSPEECH 2023, 2023, : 3337 - 3341
  • [27] Semantic Segmentation Optimization Algorithm Based on Knowledge Distillation and Model Pruning
    Yao, Weiwei
    Zhang, Jie
    Li, Chen
    Li, Shiyun
    He, Li
    Zhang, Bo
    2019 2ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND BIG DATA (ICAIBD 2019), 2019, : 261 - 265
  • [28] AN EFFICIENT METHOD FOR MODEL PRUNING USING KNOWLEDGE DISTILLATION WITH FEW SAMPLES
    Zhou, ZhaoJing
    Zhou, Yun
    Jiang, Zhuqing
    Men, Aidong
    Wang, Haiying
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 2515 - 2519
  • [29] Using Distillation to Improve Network Performance after Pruning and Quantization
    Bao, Zhenshan
    Liu, Jiayang
    Zhang, Wenbo
    PROCEEDINGS OF THE 2019 2ND INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND MACHINE INTELLIGENCE (MLMI 2019), 2019, : 3 - 6
  • [30] A hybrid model compression approach via knowledge distillation for predicting energy consumption in additive manufacturing
    Li, Yixin
    Hu, Fu
    Liu, Ying
    Ryan, Michael
    Wang, Ray
    INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH, 2023, 61 (13) : 4525 - 4547