A Survey on Deep Neural Network Pruning: Taxonomy, Comparison, Analysis, and Recommendations

Cited by: 18
Authors
Cheng, Hongrong [1]
Zhang, Miao [2]
Shi, Javen Qinfeng [1]
Affiliations
[1] Univ Adelaide, Adelaide, SA 5005, Australia
[2] Harbin Inst Technol, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Training; Neural networks; Artificial neural networks; Surveys; Taxonomy; Reviews; Computational modeling; Deep neural network pruning; model compression; model acceleration; large language models; vision transformers; large multimodal models; diffusion models; edge devices;
DOI
10.1109/TPAMI.2024.3447085
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Modern deep neural networks, particularly recent large language models, come with massive model sizes that require significant computational and storage resources. To enable the deployment of modern models in resource-constrained environments and to accelerate inference, researchers have increasingly explored pruning techniques, a popular research direction in neural network compression. More than three thousand pruning papers were published from 2020 to 2024, yet there is a dearth of up-to-date comprehensive reviews on pruning. To address this gap, this survey provides a comprehensive review of existing research on deep neural network pruning, organized in a taxonomy of 1) universal/specific speedup, 2) when to prune, 3) how to prune, and 4) fusion of pruning with other compression techniques. We then provide a thorough comparative analysis of eight pairs of contrasting settings for pruning (e.g., unstructured/structured, one-shot/iterative, data-free/data-driven, initialized/pre-trained weights) and explore several emerging topics, including pruning for large language models, vision transformers, diffusion models, and large multimodal models; post-training pruning; and different levels of supervision for pruning. This analysis sheds light on the commonalities and differences of existing methods and lays the foundation for further method development. Finally, we offer recommendations on selecting pruning methods and outline several promising research directions for neural network pruning. To facilitate future research on deep neural network pruning, we summarize broad pruning applications (e.g., adversarial robustness, natural language understanding) and build a curated collection of datasets, networks, and evaluations across different applications. We maintain a repository at https://github.com/hrcheng1066/awesome-pruning that serves as a comprehensive resource for neural network pruning papers and corresponding open-source code, and we will keep updating it to include the latest advancements in the field.
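To make the unstructured/structured contrast named in the abstract concrete, here is a minimal sketch using PyTorch's built-in torch.nn.utils.prune utilities; the two-layer model, the 50%/25% sparsity levels, and the L1/L2 importance criteria are illustrative assumptions, not the specific methods compared in the survey.

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Illustrative two-layer model (hypothetical; any nn.Module works).
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# Unstructured (fine-grained) pruning: mask the 50% of individual weights
# with the smallest L1 magnitude in the first linear layer.
prune.l1_unstructured(model[0], name="weight", amount=0.5)

# Structured pruning: zero the 25% of output channels (rows, dim=0) of the
# last layer with the smallest L2 norm, giving regular, hardware-friendly
# sparsity.
prune.ln_structured(model[2], name="weight", amount=0.25, n=2, dim=0)

# One-shot pruning folds the masks into the weights once; an iterative
# scheme would instead alternate small pruning steps with fine-tuning.
for module in (model[0], model[2]):
    prune.remove(module, "weight")

# Report the achieved sparsity per pruned layer.
for idx in (0, 2):
    w = model[idx].weight
    print(f"layer {idx}: {100.0 * (w == 0).float().mean().item():.1f}% weights zeroed")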
Pages: 10558 - 10578
Page count: 21