A Survey on Deep Neural Network Pruning: Taxonomy, Comparison, Analysis, and Recommendations

Cited: 18
Authors
Cheng, Hongrong [1 ]
Zhang, Miao [2 ]
Shi, Javen Qinfeng [1 ]
Affiliations
[1] Univ Adelaide, Adelaide, SA 5005, Australia
[2] Harbin Inst Technol, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Training; Neural networks; Artificial neural networks; Surveys; Taxonomy; Reviews; Computational modeling; Deep neural network pruning; model compression; model acceleration; large language models; vision transformers; large multimodal models; diffusion models; edge devices;
DOI
10.1109/TPAMI.2024.3447085
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Numbers
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Modern deep neural networks, particularly recent large language models, come with massive model sizes that require significant computational and storage resources. To enable the deployment of modern models in resource-constrained environments and to accelerate inference, researchers have increasingly explored pruning techniques as a popular research direction in neural network compression. More than three thousand pruning papers were published from 2020 to 2024, yet up-to-date comprehensive reviews of pruning remain scarce. To address this gap, this survey provides a comprehensive review of existing research on deep neural network pruning, organized in a taxonomy of 1) universal/specific speedup, 2) when to prune, 3) how to prune, and 4) fusion of pruning with other compression techniques. We then provide a thorough comparative analysis of eight pairs of contrast settings for pruning (e.g., unstructured/structured, one-shot/iterative, data-free/data-driven, initialized/pre-trained weights) and explore several emerging topics, including pruning for large language models, vision transformers, diffusion models, and large multimodal models, post-training pruning, and different levels of supervision for pruning. These analyses shed light on the commonalities and differences of existing methods and lay the foundation for further method development. Finally, we offer recommendations on selecting pruning methods and outline several promising research directions for neural network pruning. To facilitate future research, we summarize broad pruning applications (e.g., adversarial robustness, natural language understanding) and build a curated collection of datasets, networks, and evaluations across different applications. We maintain a repository at https://github.com/hrcheng1066/awesome-pruning that serves as a comprehensive resource for neural network pruning papers and corresponding open-source code, and we will keep updating it with the latest advancements in the field.
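To make the abstract's unstructured/structured contrast concrete, below is a minimal sketch of magnitude-based pruning applied to a single fully connected layer, written in PyTorch. This is an illustrative example only, not the method proposed in the surveyed paper: the helper names unstructured_magnitude_prune and structured_magnitude_prune are hypothetical, and weight magnitude is just one of the many importance criteria the survey covers.

    import torch
    import torch.nn as nn

    def unstructured_magnitude_prune(layer: nn.Linear, sparsity: float) -> None:
        # Zero out the individual weights with the smallest absolute values.
        # The layer keeps its shape; realizing a speedup then requires
        # sparse kernels or hardware support.
        w = layer.weight.data
        k = int(sparsity * w.numel())
        if k == 0:
            return
        threshold = w.abs().flatten().kthvalue(k).values
        w.mul_(w.abs() > threshold)  # bool mask is promoted to float

    def structured_magnitude_prune(layer: nn.Linear, sparsity: float) -> nn.Linear:
        # Drop whole output neurons (rows of the weight matrix) with the
        # smallest L1 norms, returning a genuinely smaller dense layer.
        # (In a full network, the next layer's input dim must shrink to match.)
        w = layer.weight.data
        n_keep = max(1, int((1.0 - sparsity) * w.shape[0]))
        scores = w.abs().sum(dim=1)                      # per-neuron importance
        keep = scores.topk(n_keep).indices.sort().values
        pruned = nn.Linear(layer.in_features, n_keep, bias=layer.bias is not None)
        pruned.weight.data = w[keep].clone()
        if layer.bias is not None:
            pruned.bias.data = layer.bias.data[keep].clone()
        return pruned

    if __name__ == "__main__":
        torch.manual_seed(0)
        fc = nn.Linear(64, 32)
        unstructured_magnitude_prune(fc, sparsity=0.5)
        print("zeroed fraction:", (fc.weight == 0).float().mean().item())
        smaller = structured_magnitude_prune(nn.Linear(64, 32), sparsity=0.5)
        print("remaining neurons:", smaller.out_features)

The sketch also shows why the survey treats the two settings as a contrast pair: unstructured pruning only zeroes entries and needs sparse kernels or dedicated hardware to turn sparsity into speed, whereas structured pruning yields a smaller dense layer that accelerates inference on commodity hardware at the cost of coarser granularity.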
Pages: 10558 - 10578
Number of pages: 21
Related Papers
50 records in total
  • [1] Pruning and quantization for deep neural network acceleration: A survey
    Liang, Tailin
    Glossner, John
    Wang, Lei
    Shi, Shaobo
    Zhang, Xiaotong
    NEUROCOMPUTING, 2021, 461 : 370 - 403
  • [2] Convolutional Neural Network Pruning: A Survey
    Xu, Sheng
    Huang, Anran
    Chen, Lei
    Zhang, Baochang
    PROCEEDINGS OF THE 39TH CHINESE CONTROL CONFERENCE, 2020, : 7458 - 7463
  • [3] Dimensionality reduced training by pruning and freezing parts of a deep neural network: a survey
    Wimmer, Paul
    Mehnert, Jens
    Condurache, Alexandru Paul
    ARTIFICIAL INTELLIGENCE REVIEW, 2023, 56 (12) : 14257 - 14295
  • [4] Pruning by explaining: A novel criterion for deep neural network pruning
    Yeom, Seul-Ki
    Seegerer, Philipp
    Lapuschkin, Sebastian
    Binder, Alexander
    Wiedemann, Simon
    Mueller, Klaus-Robert
    Samek, Wojciech
    PATTERN RECOGNITION, 2021, 115
  • [5] Pruning the deep neural network by similar function
    Liu, Hanqing
    Xin, Bo
    Mu, Senlin
    Zhu, Zhangqing
    2018 INTERNATIONAL SYMPOSIUM ON POWER ELECTRONICS AND CONTROL ENGINEERING (ISPECE 2018), 2019, 1187
  • [6] Automated Pruning for Deep Neural Network Compression
    Manessi, Franco
    Rozza, Alessandro
    Bianco, Simone
    Napoletano, Paolo
    Schettini, Raimondo
    2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 657 - 664
  • [7] Overview of Deep Convolutional Neural Network Pruning
    Li, Guang
    Liu, Fang
    Xia, Yuping
    2020 INTERNATIONAL CONFERENCE ON IMAGE, VIDEO PROCESSING AND ARTIFICIAL INTELLIGENCE, 2020, 11584
  • [8] Structured Pruning for Deep Convolutional Neural Networks: A Survey
    He, Yang
    Xiao, Lingao
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (05) : 2900 - 2919
  • [9] A Discriminant Information Approach to Deep Neural Network Pruning
    Hou, Zejiang
    Kung, Sun-Yuan
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 9553 - 9560