A Survey on Deep Neural Network Pruning: Taxonomy, Comparison, Analysis, and Recommendations

Cited by: 18
Authors
Cheng, Hongrong [1 ]
Zhang, Miao [2 ]
Shi, Javen Qinfeng [1 ]
Affiliations
[1] Univ Adelaide, Adelaide, SA 5005, Australia
[2] Harbin Inst Technol, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Training; Neural networks; Artificial neural networks; Surveys; Taxonomy; Reviews; Computational modeling; Deep neural network pruning; model compression; model acceleration; large language models; vision transformers; large multimodal models; diffusion models; edge devices;
DOI
10.1109/TPAMI.2024.3447085
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Modern deep neural networks, particularly recent large language models, come with massive model sizes that require significant computational and storage resources. To enable the deployment of modern models in resource-constrained environments and to accelerate inference time, researchers have increasingly explored pruning techniques as a popular research direction in neural network compression. More than three thousand pruning papers have been published from 2020 to 2024. However, there is a dearth of up-to-date comprehensive review papers on pruning. To address this issue, in this survey, we provide a comprehensive review of existing research works on deep neural network pruning in a taxonomy of 1) universal/specific speedup, 2) when to prune, 3) how to prune, and 4) fusion of pruning and other compression techniques. We then provide a thorough comparative analysis of eight pairs of contrast settings for pruning (e.g., unstructured/structured, one-shot/iterative, data-free/data-driven, initialized/pre-trained weights, etc.) and explore several emerging topics, including pruning for large language models, vision transformers, diffusion models, and large multimodal models, post-training pruning, and different levels of supervision for pruning, to shed light on the commonalities and differences of existing methods and lay the foundation for further method development. Finally, we provide some valuable recommendations on selecting pruning methods and prospect several promising research directions for neural network pruning. To facilitate future research on deep neural network pruning, we summarize broad pruning applications (e.g., adversarial robustness, natural language understanding, etc.) and build a curated collection of datasets, networks, and evaluations on different applications. We maintain a repository on https://github.com/hrcheng1066/awesome-pruning that serves as a comprehensive resource for neural network pruning papers and corresponding open-source codes. We will keep updating this repository to include the latest advancements in the field.
Pages: 10558-10578 (21 pages)
Related Papers
50 records total
  • [41] Neural network pruning with Tukey-Kramer multiple comparison procedure
    Duckro, DE
    Quinn, DW
    Gardner, SJ
    NEURAL COMPUTATION, 2002, 14 (05) : 1149 - 1168
  • [42] A Deep Neural Network for Crossing-City POI Recommendations
    Li, Dichao
    Gong, Zhiguo
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (08) : 3536 - 3548
  • [43] Deep neural network pruning method based on sensitive layers and reinforcement learning
    Yang, Wenchuan
    Yu, Haoran
    Cui, Baojiang
    Sui, Runqi
    Gu, Tianyu
    ARTIFICIAL INTELLIGENCE REVIEW, 2023, 56 (SUPPL 2) : 1897 - 1917
  • [44] An FPGA Realization of a Deep Convolutional Neural Network Using a Threshold Neuron Pruning
    Fujii, Tomoya
    Sato, Simpei
    Nakahara, Hiroki
    Motomura, Masato
    APPLIED RECONFIGURABLE COMPUTING, 2017, 10216 : 268 - 280
  • [45] Pruning Deep Neural Network Models via Minimax Concave Penalty Regression
    Liu, Xinggu
    Zhou, Lin
    Luo, Youxi
    APPLIED SCIENCES-BASEL, 2024, 14 (09):
  • [46] Explainable online ensemble of deep neural network pruning for time series forecasting
    Saadallah, Amal
    Jakobs, Matthias
    Morik, Katharina
    MACHINE LEARNING, 2022, 111 (09) : 3459 - 3487
  • [47] Deep neural network compression through interpretability-based filter pruning
    Yao, Kaixuan
    Cao, Feilong
    Leung, Yee
    Liang, Jiye
    PATTERN RECOGNITION, 2021, 119
  • [49] An efficient pruning and fine-tuning method for deep spiking neural network
    L. W. Meng
    G. C. Qiao
    X. Y. Zhang
    J. Bai
    Y. Zuo
    P. J. Zhou
    Y. Liu
    S. G. Hu
    Applied Intelligence, 2023, 53 : 28910 - 28923
  • [50] Absorption Pruning of Deep Neural Network for Object Detection in Remote Sensing Imagery
    Wang, Jielei
    Cui, Zongyong
    Zang, Zhipeng
    Meng, Xiangjie
    Cao, Zongjie
    REMOTE SENSING, 2022, 14 (24)