A Survey on Deep Neural Network Pruning: Taxonomy, Comparison, Analysis, and Recommendations

Cited by: 18
Authors
Cheng, Hongrong [1]
Zhang, Miao [2]
Shi, Javen Qinfeng [1]
Affiliations
[1] Univ Adelaide, Adelaide, SA 5005, Australia
[2] Harbin Inst Technol, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Training; Neural networks; Artificial neural networks; Surveys; Taxonomy; Reviews; Computational modeling; Deep neural network pruning; model compression; model acceleration; large language models; vision transformers; large multimodal models; diffusion models; edge devices;
DOI
10.1109/TPAMI.2024.3447085
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Modern deep neural networks, particularly recent large language models, come with massive model sizes that require significant computational and storage resources. To enable the deployment of modern models in resource-constrained environments and to accelerate inference, researchers have increasingly explored pruning techniques, a popular research direction in neural network compression. More than three thousand pruning papers were published from 2020 to 2024, yet there is a dearth of up-to-date comprehensive reviews on pruning. To address this gap, this survey provides a comprehensive review of existing research on deep neural network pruning, organized in a taxonomy of 1) universal/specific speedup, 2) when to prune, 3) how to prune, and 4) fusion of pruning with other compression techniques. We then provide a thorough comparative analysis of eight pairs of contrasting settings for pruning (e.g., unstructured/structured, one-shot/iterative, data-free/data-driven, initialized/pre-trained weights) and explore several emerging topics, including pruning for large language models, vision transformers, diffusion models, and large multimodal models; post-training pruning; and different levels of supervision for pruning. This analysis sheds light on the commonalities and differences of existing methods and lays the foundation for further method development. Finally, we offer recommendations on selecting pruning methods and outline several promising research directions for neural network pruning. To facilitate future research on deep neural network pruning, we summarize broad pruning applications (e.g., adversarial robustness, natural language understanding) and build a curated collection of datasets, networks, and evaluations across different applications. We maintain a repository at https://github.com/hrcheng1066/awesome-pruning that serves as a comprehensive resource for neural network pruning papers and corresponding open-source code, and we will keep updating it to include the latest advancements in the field.
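To make the unstructured/structured contrast named in the abstract concrete, here is a minimal sketch using PyTorch's built-in torch.nn.utils.prune utilities; the two-layer model, the 50%/25% sparsity levels, and the L1/L2 importance criteria are illustrative assumptions, not the specific methods compared in the survey.

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Illustrative two-layer model (hypothetical; any nn.Module works).
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# Unstructured (fine-grained) pruning: mask the 50% of individual weights
# with the smallest L1 magnitude in the first linear layer.
prune.l1_unstructured(model[0], name="weight", amount=0.5)

# Structured pruning: zero the 25% of output channels (rows, dim=0) of the
# last layer with the smallest L2 norm, giving regular, hardware-friendly
# sparsity.
prune.ln_structured(model[2], name="weight", amount=0.25, n=2, dim=0)

# One-shot pruning folds the masks into the weights once; an iterative
# scheme would instead alternate small pruning steps with fine-tuning.
for module in (model[0], model[2]):
    prune.remove(module, "weight")

# Report the achieved sparsity per pruned layer.
for idx in (0, 2):
    w = model[idx].weight
    print(f"layer {idx}: {100.0 * (w == 0).float().mean().item():.1f}% weights zeroed")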
Pages: 10558 - 10578
Page count: 21