Model Compression for Deep Neural Networks: A Survey

Citations: 68
Authors
Li, Zhuo [1]
Li, Hengyi [1]
Meng, Lin [2]
Affiliations
[1] Ritsumeikan Univ, Grad Sch Sci & Engn, 1-1-1 Noji Higashi, Kusatsu 525-8577, Japan
[2] Ritsumeikan Univ, Coll Sci & Engn, 1-1-1 Noji Higashi, Kusatsu 525-8577, Japan
Keywords
deep neural networks; model compression; model pruning; parameter quantization; low-rank decomposition; knowledge distillation; lightweight model design
DOI
10.3390/computers12030060
CLC Number
TP39 [Computer Applications]
Discipline Codes
081203; 0835
Abstract
Currently, with the rapid development of deep learning, deep neural networks (DNNs) have been widely applied in various computer vision tasks. However, in the pursuit of performance, advanced DNN models have become increasingly complex, leading to large memory footprints and high computation demands that make them difficult to deploy in real-time applications. To address these issues, model compression has become a focus of research, and model compression techniques play an important role in deploying models on edge devices. This study analyzes various model compression methods to help researchers reduce device storage requirements, speed up model inference, reduce model complexity and training costs, and improve model deployment. To this end, the paper summarizes the state-of-the-art techniques for model compression, including model pruning, parameter quantization, low-rank decomposition, knowledge distillation, and lightweight model design, and discusses research challenges and directions for future work.
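As a minimal illustration of two of the technique families the survey covers, the sketch below applies magnitude-based weight pruning and symmetric 8-bit uniform quantization to a random weight matrix with NumPy. It is a generic example of these techniques, not the specific methods evaluated in the paper; the sparsity target, bit width, and all variable names are assumptions chosen for this sketch.

```python
# Illustrative sketch (not from the paper): magnitude-based weight pruning
# followed by symmetric per-tensor 8-bit quantization of a weight matrix.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(256, 256)).astype(np.float32)

# --- Model pruning: zero out the weights with the smallest magnitudes. ---
sparsity = 0.9  # assumed target: keep only the largest 10% of weights
threshold = np.quantile(np.abs(weights), sparsity)
pruned = weights * (np.abs(weights) >= threshold)  # bool mask keeps dtype

# --- Parameter quantization: map float32 weights to int8 codes. ---
scale = np.abs(pruned).max() / 127.0  # symmetric per-tensor scale factor
quantized = np.clip(np.round(pruned / scale), -127, 127).astype(np.int8)
dequantized = quantized.astype(np.float32) * scale  # values used at inference

print(f"nonzero weights: {np.count_nonzero(pruned) / pruned.size:.1%}")
print(f"max quantization error: {np.abs(dequantized - pruned).max():.4f}")
```

In practice, as the survey discusses, such one-shot compression is usually followed by fine-tuning or retraining to recover the accuracy lost to pruning and quantization.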
Pages: 22