A Survey on Deep Neural Network Pruning: Taxonomy, Comparison, Analysis, and Recommendations

Cited by: 18
Authors
Cheng, Hongrong [1 ]
Zhang, Miao [2 ]
Shi, Javen Qinfeng [1 ]
Affiliations
[1] Univ Adelaide, Adelaide, SA 5005, Australia
[2] Harbin Inst Technol, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Training; Neural networks; Artificial neural networks; Surveys; Taxonomy; Reviews; Computational modeling; Deep neural network pruning; model compression; model acceleration; large language models; vision transformers; large multimodal models; diffusion models; edge devices;
DOI
10.1109/TPAMI.2024.3447085
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Modern deep neural networks, particularly recent large language models, come with massive model sizes that require significant computational and storage resources. To enable the deployment of modern models in resource-constrained environments and to accelerate inference, researchers have increasingly explored pruning techniques as a popular research direction in neural network compression. More than three thousand pruning papers were published from 2020 to 2024, yet up-to-date comprehensive review papers on pruning remain scarce. To address this issue, in this survey, we provide a comprehensive review of existing research works on deep neural network pruning in a taxonomy of 1) universal/specific speedup, 2) when to prune, 3) how to prune, and 4) fusion of pruning with other compression techniques. We then provide a thorough comparative analysis of eight pairs of contrast settings for pruning (e.g., unstructured/structured, one-shot/iterative, data-free/data-driven, initialized/pre-trained weights) and explore several emerging topics, including pruning for large language models, vision transformers, diffusion models, and large multimodal models, post-training pruning, and different levels of supervision for pruning, to shed light on the commonalities and differences of existing methods and lay the foundation for further method development. Finally, we offer recommendations on selecting pruning methods and outline several promising research directions for neural network pruning. To facilitate future research on deep neural network pruning, we summarize broad pruning applications (e.g., adversarial robustness, natural language understanding) and build a curated collection of datasets, networks, and evaluations for different applications. We maintain a repository at https://github.com/hrcheng1066/awesome-pruning that serves as a comprehensive resource for neural network pruning papers and corresponding open-source code, and we will keep updating it with the latest advancements in the field.
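To make the unstructured/structured contrast setting mentioned in the abstract concrete, below is a minimal NumPy sketch of one-shot, magnitude-based unstructured pruning: the smallest-magnitude fraction of weights is zeroed out via a binary mask. The function name and interface are illustrative only and are not taken from the survey or any specific library.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """One-shot unstructured pruning: zero the smallest-magnitude
    fraction `sparsity` of entries and return (pruned, keep_mask)."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to remove
    if k == 0:
        return weights.copy(), np.ones(weights.shape, dtype=bool)
    # k-th smallest magnitude serves as the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold  # keep strictly larger magnitudes
    return weights * mask, mask

# Example: prune 50% of a random 4x4 weight matrix
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
pruned, mask = magnitude_prune(w, 0.5)
```

Structured pruning differs in that it removes entire units (filters, channels, attention heads) rather than individual weights, which yields speedups on standard hardware without sparse kernels; iterative schemes repeat this prune step interleaved with fine-tuning instead of applying it once.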
Pages: 10558-10578
Page count: 21
Related Papers
50 records in total
  • [21] Comparison Analysis for Pruning Algorithms of Neural Networks
    Chen, Xi
    Mao, Jincheng
    Xie, Jian
    2021 2ND INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND INTELLIGENT CONTROL (ICCEIC 2021), 2021, : 50 - 56
  • [22] Automated Design of Deep Neural Networks: A Survey and Unified Taxonomy
    Talbi, El-Ghazali
    ACM COMPUTING SURVEYS, 2021, 54 (02)
  • [23] Lightweight Inference by Neural Network Pruning: Accuracy, Time and Comparison
    Paralikas, Ilias
    Spantideas, Sotiris
    Giannopoulos, Anastasios
    Trakadas, Panagiotis
    ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, PT III, AIAI 2024, 2024, 713 : 248 - 257
  • [24] A Pruning Neural Network Model in Credit Classification Analysis
    Tang, Yajiao
    Ji, Junkai
    Gao, Shangce
    Dai, Hongwei
    Yu, Yang
    Todo, Yuki
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2018, 2018 : 9390410
  • [25] RESHAPING DEEP NEURAL NETWORK FOR FAST DECODING BY NODE-PRUNING
    He, Tianxing
    Fan, Yuchen
    Qian, Yanmin
    Tan, Tian
    Yu, Kai
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [26] A framework for deep neural network multiuser authorization based on channel pruning
    Wang, Linna
    Song, Yunfei
    Zhu, Yujia
    Xia, Daoxun
    Han, Guoquan
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2023, 35 (21):
  • [27] A New Pruning Algorithm for Neural Network Dimension Analysis
    Sabo, Devin
    Yu, Xiao-Hua
    2008 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-8, 2008, : 3313 - 3318
  • [28] Fused Pruning based Robust Deep Neural Network Watermark Embedding
    Li, Tengfei
    Wang, Shuo
    Jing, Huiyun
    Lian, Zhichao
    Meng, Shunmei
    Li, Qianmu
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 2475 - 2481
  • [29] Deep Neural Network Compression by In-Parallel Pruning-Quantization
    Tung, Frederick
    Mori, Greg
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2020, 42 (03) : 568 - 579
  • [30] ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression
    Luo, Jian-Hao
    Wu, Jianxin
    Lin, Weiyao
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 5068 - 5076