A Survey on Deep Neural Network Pruning: Taxonomy, Comparison, Analysis, and Recommendations

Cited: 18
Authors
Cheng, Hongrong [1 ]
Zhang, Miao [2 ]
Shi, Javen Qinfeng [1 ]
Affiliations
[1] Univ Adelaide, Adelaide, SA 5005, Australia
[2] Harbin Inst Technol, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Training; Neural networks; Artificial neural networks; Surveys; Taxonomy; Reviews; Computational modeling; Deep neural network pruning; model compression; model acceleration; large language models; vision transformers; large multimodal models; diffusion models; edge devices;
DOI
10.1109/TPAMI.2024.3447085
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Numbers
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Modern deep neural networks, particularly recent large language models, come with massive model sizes that require significant computational and storage resources. To enable the deployment of modern models in resource-constrained environments and to accelerate inference, researchers have increasingly explored pruning techniques as a popular research direction in neural network compression. More than three thousand pruning papers were published from 2020 to 2024, yet up-to-date comprehensive reviews of pruning remain scarce. To address this gap, this survey provides a comprehensive review of existing research on deep neural network pruning, organized in a taxonomy of 1) universal/specific speedup, 2) when to prune, 3) how to prune, and 4) fusion of pruning with other compression techniques. We then provide a thorough comparative analysis of eight pairs of contrast settings for pruning (e.g., unstructured/structured, one-shot/iterative, data-free/data-driven, initialized/pre-trained weights) and explore several emerging topics, including pruning for large language models, vision transformers, diffusion models, and large multimodal models, post-training pruning, and different levels of supervision for pruning. These analyses shed light on the commonalities and differences of existing methods and lay the foundation for further method development. Finally, we offer recommendations on selecting pruning methods and outline several promising research directions for neural network pruning. To facilitate future research, we summarize broad pruning applications (e.g., adversarial robustness, natural language understanding) and build a curated collection of datasets, networks, and evaluations across different applications. We maintain a repository at https://github.com/hrcheng1066/awesome-pruning that serves as a comprehensive resource for neural network pruning papers and corresponding open-source code, and we will keep updating it with the latest advancements in the field.
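To make the abstract's unstructured/structured contrast concrete, below is a minimal sketch of magnitude-based pruning applied to a single fully connected layer, written in PyTorch. This is an illustrative example only, not the method proposed in the surveyed paper: the helper names unstructured_magnitude_prune and structured_magnitude_prune are hypothetical, and weight magnitude is just one of the many importance criteria the survey covers.

    import torch
    import torch.nn as nn

    def unstructured_magnitude_prune(layer: nn.Linear, sparsity: float) -> None:
        # Zero out the individual weights with the smallest absolute values.
        # The layer keeps its shape; realizing a speedup then requires
        # sparse kernels or hardware support.
        w = layer.weight.data
        k = int(sparsity * w.numel())
        if k == 0:
            return
        threshold = w.abs().flatten().kthvalue(k).values
        w.mul_(w.abs() > threshold)  # bool mask is promoted to float

    def structured_magnitude_prune(layer: nn.Linear, sparsity: float) -> nn.Linear:
        # Drop whole output neurons (rows of the weight matrix) with the
        # smallest L1 norms, returning a genuinely smaller dense layer.
        # (In a full network, the next layer's input dim must shrink to match.)
        w = layer.weight.data
        n_keep = max(1, int((1.0 - sparsity) * w.shape[0]))
        scores = w.abs().sum(dim=1)                      # per-neuron importance
        keep = scores.topk(n_keep).indices.sort().values
        pruned = nn.Linear(layer.in_features, n_keep, bias=layer.bias is not None)
        pruned.weight.data = w[keep].clone()
        if layer.bias is not None:
            pruned.bias.data = layer.bias.data[keep].clone()
        return pruned

    if __name__ == "__main__":
        torch.manual_seed(0)
        fc = nn.Linear(64, 32)
        unstructured_magnitude_prune(fc, sparsity=0.5)
        print("zeroed fraction:", (fc.weight == 0).float().mean().item())
        smaller = structured_magnitude_prune(nn.Linear(64, 32), sparsity=0.5)
        print("remaining neurons:", smaller.out_features)

The sketch also shows why the survey treats the two settings as a contrast pair: unstructured pruning only zeroes entries and needs sparse kernels or dedicated hardware to turn sparsity into speed, whereas structured pruning yields a smaller dense layer that accelerates inference on commodity hardware at the cost of coarser granularity.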
Pages: 10558 - 10578
Number of pages: 21
Related Papers
50 records in total
  • [1] Pruning and quantization for deep neural network acceleration: A survey
    Liang, Tailin
    Glossner, John
    Wang, Lei
    Shi, Shaobo
    Zhang, Xiaotong
    NEUROCOMPUTING, 2021, 461 : 370 - 403
  • [2] Convolutional Neural Network Pruning: A Survey
    Xu, Sheng
    Huang, Anran
    Chen, Lei
    Zhang, Baochang
    PROCEEDINGS OF THE 39TH CHINESE CONTROL CONFERENCE, 2020, : 7458 - 7463
  • [3] Dimensionality reduced training by pruning and freezing parts of a deep neural network: a survey
    Wimmer, Paul
    Mehnert, Jens
    Condurache, Alexandru Paul
    ARTIFICIAL INTELLIGENCE REVIEW, 2023, 56 (12) : 14257 - 14295
  • [4] Pruning by explaining: A novel criterion for deep neural network pruning
    Yeom, Seul-Ki
    Seegerer, Philipp
    Lapuschkin, Sebastian
    Binder, Alexander
    Wiedemann, Simon
    Mueller, Klaus-Robert
    Samek, Wojciech
    PATTERN RECOGNITION, 2021, 115
  • [5] Pruning the deep neural network by similar function
    Liu, Hanqing
    Xin, Bo
    Mu, Senlin
    Zhu, Zhangqing
    2018 INTERNATIONAL SYMPOSIUM ON POWER ELECTRONICS AND CONTROL ENGINEERING (ISPECE 2018), 2019, 1187
  • [6] Automated Pruning for Deep Neural Network Compression
    Manessi, Franco
    Rozza, Alessandro
    Bianco, Simone
    Napoletano, Paolo
    Schettini, Raimondo
    2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 657 - 664
  • [7] Overview of Deep Convolutional Neural Network Pruning
    Li, Guang
    Liu, Fang
    Xia, Yuping
    2020 INTERNATIONAL CONFERENCE ON IMAGE, VIDEO PROCESSING AND ARTIFICIAL INTELLIGENCE, 2020, 11584
  • [8] Structured Pruning for Deep Convolutional Neural Networks: A Survey
    He, Yang
    Xiao, Lingao
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (05) : 2900 - 2919
  • [9] A Discriminant Information Approach to Deep Neural Network Pruning
    Hou, Zejiang
    Kung, Sun-Yuan
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 9553 - 9560