PQK: Model Compression via Pruning, Quantization, and Knowledge Distillation

Cited by: 9
Authors
Kim, Jangho [1 ,2 ]
Chang, Simyung [1 ]
Kwak, Nojun [2 ]
Affiliations
[1] Qualcomm Korea YH, Qualcomm AI Res, Seoul, South Korea
[2] Seoul Natl Univ, Seoul, South Korea
Source
INTERSPEECH 2021
Funding
National Research Foundation of Singapore;
Keywords
keyword spotting; model pruning; model quantization; knowledge distillation;
DOI
10.21437/Interspeech.2021-248
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology];
Subject Classification Codes
100104; 100213;
Abstract
As edge devices become prevalent, deploying Deep Neural Networks (DNNs) on them has become a critical issue. However, DNNs require high computational resources, which are rarely available on edge devices. To handle this, we propose PQK, a novel model compression method for devices with limited computational resources, consisting of pruning, quantization, and knowledge distillation (KD) processes. Unlike traditional pruning and KD, PQK makes use of the unimportant weights pruned in the pruning process to build a teacher network for training a better student network, without pre-training the teacher model. PQK has two phases. Phase 1 exploits iterative pruning and quantization-aware training to make a lightweight and power-efficient model. In phase 2, we make a teacher network by adding the unimportant weights unused in phase 1 back to the pruned network. Using this teacher network, we train the pruned network as a student network. In doing so, we do not need a pre-trained teacher network for the KD framework because the teacher and the student networks coexist within the same network (see Fig. 1). We apply our method to recognition models and verify the effectiveness of PQK on keyword spotting (KWS) and image recognition.
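The abstract describes a two-phase procedure; the snippet below is a minimal PyTorch sketch of that idea under stated assumptions, not the authors' implementation. The toy MLP, the `magnitude_masks` and `kd_loss` helpers, the keep ratio, the distillation temperature, and the 12-class dummy batch are all illustrative choices, and the iterative pruning schedule and quantization-aware training of phase 1 are omitted for brevity.

```python
# Minimal sketch of the PQK idea (assumptions noted above): the teacher
# (all weights) and the student (important weights only) share parameters.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyNet(nn.Module):
    def __init__(self, in_dim=64, hidden=128, num_classes=12):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, num_classes)

    def forward(self, x, mask=None):
        # With a mask, only the "important" (unpruned) weights are active,
        # i.e. the student subnetwork; without a mask, all weights are used.
        w1 = self.fc1.weight * mask["fc1"] if mask is not None else self.fc1.weight
        w2 = self.fc2.weight * mask["fc2"] if mask is not None else self.fc2.weight
        h = F.relu(F.linear(x, w1, self.fc1.bias))
        return F.linear(h, w2, self.fc2.bias)

def magnitude_masks(model, keep_ratio=0.5):
    # Phase 1 (simplified): keep the largest-magnitude weights; the rest are
    # the "unimportant" weights that later rejoin the teacher in phase 2.
    masks = {}
    for name in ["fc1", "fc2"]:
        w = getattr(model, name).weight.detach().abs()
        k = int(w.numel() * keep_ratio)
        thresh = w.flatten().kthvalue(w.numel() - k + 1).values
        masks[name] = (w >= thresh).float()
    return masks

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Standard KD objective: soft teacher targets plus hard-label cross-entropy.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Phase 2 (simplified): teacher = full network (pruned + unimportant weights),
# student = masked subnetwork; no separately pre-trained teacher is needed.
model = ToyNet()
masks = magnitude_masks(model, keep_ratio=0.5)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(8, 64)          # dummy batch of KWS-like features (assumed)
y = torch.randint(0, 12, (8,))  # dummy labels

with torch.no_grad():
    teacher_logits = model(x)          # teacher path: all weights
student_logits = model(x, mask=masks)  # student path: important weights only
loss = kd_loss(student_logits, teacher_logits, y)
loss.backward()
opt.step()
```

The point mirrored here is that the teacher and student coexist in one set of parameters, which is what removes the need for teacher pre-training; the actual method additionally alternates pruning with quantization-aware training in phase 1.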
Pages: 4568-4572
Number of pages: 5
Related Papers
50 in total
  • [31] Structured Compression by Weight Encryption for Unstructured Pruning and Quantization
    Kwon, Se Jung
    Lee, Dongsoo
    Kim, Byeongwook
    Kapoor, Parichay
    Park, Baeseong
    Wei, Gu-Yeon
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 1906 - 1915
  • [32] Lightweight detection network for bridge defects based on model pruning and knowledge distillation
    Guan, Bin
    Li, Junjie
    STRUCTURES, 2024, 62
  • [33] Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization
    Balaskas, Konstantinos
    Karatzas, Andreas
    Sad, Christos
    Siozios, Kostas
    Anagnostopoulos, Iraklis
    Zervakis, Georgios
    Henkel, Jorg
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2024, 12 (04) : 1079 - 1092
  • [34] Effective Model Compression via Stage-wise Pruning
    Ming-Yang Zhang
    Xin-Yi Yu
    Lin-Lin Ou
    Machine Intelligence Research, 2023, 20 : 937 - 951
  • [35] Effective Model Compression via Stage-wise Pruning
    Zhang, Ming-Yang
    Yu, Xin-Yi
    Ou, Lin-Lin
    MACHINE INTELLIGENCE RESEARCH, 2023, 20 (06) : 937 - 951
  • [36] Pruning-and-distillation: One-stage joint compression framework for CNNs via clustering
    Niu, Tao
    Teng, Yinglei
    Jin, Lei
    Zou, Panpan
    Liu, Yiding
    IMAGE AND VISION COMPUTING, 2023, 136
  • [37] Model Compression Based on Knowledge Distillation and Its Application in HRRP
    Chen, Xiaojiao
    An, Zhenyu
    Huang, Liansheng
    He, Shiying
    Wang, Zhen
    PROCEEDINGS OF 2020 IEEE 4TH INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC 2020), 2020, : 1268 - 1272
  • [38] Uncertainty-Driven Knowledge Distillation for Language Model Compression
    Huang, Tianyu
    Dong, Weisheng
    Wu, Fangfang
    Li, Xin
    Shi, Guangming
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 2850 - 2858
  • [39] Quantization via Distillation and Contrastive Learning
    Pei, Zehua
    Yao, Xufeng
    Zhao, Wenqian
    Yu, Bei
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (12) : 17164 - 17176
  • [40] Heuristic Compression Method for CNN Model Applying Quantization to a Combination of Structured and Unstructured Pruning Techniques
    Tian, Danhe
    Yamagiwa, Shinichi
    Wada, Koichi
    IEEE ACCESS, 2024, 12 : 66680 - 66689