A Survey on Deep Neural Network Pruning: Taxonomy, Comparison, Analysis, and Recommendations

Cited by: 18
Authors
Cheng, Hongrong [1 ]
Zhang, Miao [2 ]
Shi, Javen Qinfeng [1 ]
Affiliations
[1] Univ Adelaide, Adelaide, SA 5005, Australia
[2] Harbin Inst Technol, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Training; Neural networks; Artificial neural networks; Surveys; Taxonomy; Reviews; Computational modeling; Deep neural network pruning; model compression; model acceleration; large language models; vision transformers; large multimodal models; diffusion models; edge devices;
DOI
10.1109/TPAMI.2024.3447085
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Modern deep neural networks, particularly recent large language models, come with massive model sizes that require significant computational and storage resources. To enable the deployment of modern models in resource-constrained environments and to accelerate inference, researchers have increasingly explored pruning techniques as a popular research direction in neural network compression. More than three thousand pruning papers were published from 2020 to 2024, yet up-to-date comprehensive review papers on pruning remain scarce. To address this issue, in this survey, we provide a comprehensive review of existing research works on deep neural network pruning in a taxonomy of 1) universal/specific speedup, 2) when to prune, 3) how to prune, and 4) fusion of pruning with other compression techniques. We then provide a thorough comparative analysis of eight pairs of contrast settings for pruning (e.g., unstructured/structured, one-shot/iterative, data-free/data-driven, initialized/pre-trained weights) and explore several emerging topics, including pruning for large language models, vision transformers, diffusion models, and large multimodal models, post-training pruning, and different levels of supervision for pruning, to shed light on the commonalities and differences of existing methods and lay the foundation for further method development. Finally, we offer recommendations on selecting pruning methods and outline several promising research directions for neural network pruning. To facilitate future research on deep neural network pruning, we summarize broad pruning applications (e.g., adversarial robustness, natural language understanding) and build a curated collection of datasets, networks, and evaluations for different applications. We maintain a repository at https://github.com/hrcheng1066/awesome-pruning that serves as a comprehensive resource for neural network pruning papers and corresponding open-source code, and we will keep updating it with the latest advancements in the field.
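To make the unstructured/structured contrast setting mentioned in the abstract concrete, below is a minimal NumPy sketch of one-shot, magnitude-based unstructured pruning: the smallest-magnitude fraction of weights is zeroed out via a binary mask. The function name and interface are illustrative only and are not taken from the survey or any specific library.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """One-shot unstructured pruning: zero the smallest-magnitude
    fraction `sparsity` of entries and return (pruned, keep_mask)."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to remove
    if k == 0:
        return weights.copy(), np.ones(weights.shape, dtype=bool)
    # k-th smallest magnitude serves as the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold  # keep strictly larger magnitudes
    return weights * mask, mask

# Example: prune 50% of a random 4x4 weight matrix
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
pruned, mask = magnitude_prune(w, 0.5)
```

Structured pruning differs in that it removes entire units (filters, channels, attention heads) rather than individual weights, which yields speedups on standard hardware without sparse kernels; iterative schemes repeat this prune step interleaved with fine-tuning instead of applying it once.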
Pages: 10558-10578
Page count: 21
Related Papers
50 records in total
  • [21] Comparison Analysis for Pruning Algorithms of Neural Networks
    Chen, Xi
    Mao, Jincheng
    Xie, Jian
    2021 2ND INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND INTELLIGENT CONTROL (ICCEIC 2021), 2021, : 50 - 56
  • [22] Automated Design of Deep Neural Networks: A Survey and Unified Taxonomy
    Talbi, El-Ghazali
    ACM COMPUTING SURVEYS, 2021, 54 (02)
  • [23] Lightweight Inference by Neural Network Pruning: Accuracy, Time and Comparison
    Paralikas, Ilias
    Spantideas, Sotiris
    Giannopoulos, Anastasios
    Trakadas, Panagiotis
    ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, PT III, AIAI 2024, 2024, 713 : 248 - 257
  • [24] A Pruning Neural Network Model in Credit Classification Analysis
    Tang, Yajiao
    Ji, Junkai
    Gao, Shangce
    Dai, Hongwei
    Yu, Yang
    Todo, Yuki
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2018, 2018 : 9390410
  • [25] RESHAPING DEEP NEURAL NETWORK FOR FAST DECODING BY NODE-PRUNING
    He, Tianxing
    Fan, Yuchen
    Qian, Yanmin
    Tan, Tian
    Yu, Kai
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [26] A framework for deep neural network multiuser authorization based on channel pruning
    Wang, Linna
    Song, Yunfei
    Zhu, Yujia
    Xia, Daoxun
    Han, Guoquan
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2023, 35 (21):
  • [27] A New Pruning Algorithm for Neural Network Dimension Analysis
    Sabo, Devin
    Yu, Xiao-Hua
    2008 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-8, 2008, : 3313 - 3318
  • [28] Fused Pruning based Robust Deep Neural Network Watermark Embedding
    Li, Tengfei
    Wang, Shuo
    Jing, Huiyun
    Lian, Zhichao
    Meng, Shunmei
    Li, Qianmu
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 2475 - 2481
  • [29] Deep Neural Network Compression by In-Parallel Pruning-Quantization
    Tung, Frederick
    Mori, Greg
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2020, 42 (03) : 568 - 579
  • [30] ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression
    Luo, Jian-Hao
    Wu, Jianxin
    Lin, Weiyao
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 5068 - 5076