A Survey on Deep Neural Network Pruning: Taxonomy, Comparison, Analysis, and Recommendations

Cited by: 18
Authors
Cheng, Hongrong [1 ]
Zhang, Miao [2 ]
Shi, Javen Qinfeng [1 ]
Affiliations
[1] Univ Adelaide, Adelaide, SA 5005, Australia
[2] Harbin Inst Technol, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Training; Neural networks; Artificial neural networks; Surveys; Taxonomy; Reviews; Computational modeling; Deep neural network pruning; model compression; model acceleration; large language models; vision transformers; large multimodal models; diffusion models; edge devices;
DOI
10.1109/TPAMI.2024.3447085
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Modern deep neural networks, particularly recent large language models, come with massive model sizes that require significant computational and storage resources. To enable the deployment of modern models in resource-constrained environments and to accelerate inference time, researchers have increasingly explored pruning techniques as a popular research direction in neural network compression. More than three thousand pruning papers have been published from 2020 to 2024. However, there is a dearth of up-to-date comprehensive review papers on pruning. To address this issue, in this survey, we provide a comprehensive review of existing research works on deep neural network pruning in a taxonomy of 1) universal/specific speedup, 2) when to prune, 3) how to prune, and 4) fusion of pruning and other compression techniques. We then provide a thorough comparative analysis of eight pairs of contrast settings for pruning (e.g., unstructured/structured, one-shot/iterative, data-free/data-driven, initialized/pre-trained weights, etc.) and explore several emerging topics, including pruning for large language models, vision transformers, diffusion models, and large multimodal models, post-training pruning, and different levels of supervision for pruning, to shed light on the commonalities and differences of existing methods and lay the foundation for further method development. Finally, we provide some valuable recommendations on selecting pruning methods and prospect several promising research directions for neural network pruning. To facilitate future research on deep neural network pruning, we summarize broad pruning applications (e.g., adversarial robustness, natural language understanding, etc.) and build a curated collection of datasets, networks, and evaluations on different applications. We maintain a repository on https://github.com/hrcheng1066/awesome-pruning that serves as a comprehensive resource for neural network pruning papers and corresponding open-source codes. We will keep updating this repository to include the latest advancements in the field.
Pages: 10558-10578 (21 pages)
Related Papers
50 records total
  • [41] Neural network pruning with Tukey-Kramer multiple comparison procedure
    Duckro, DE
    Quinn, DW
    Gardner, SJ
    NEURAL COMPUTATION, 2002, 14 (05) : 1149 - 1168
  • [42] A Deep Neural Network for Crossing-City POI Recommendations
    Li, Dichao
    Gong, Zhiguo
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (08) : 3536 - 3548
  • [43] Deep neural network pruning method based on sensitive layers and reinforcement learning
    Yang, Wenchuan
    Yu, Haoran
    Cui, Baojiang
    Sui, Runqi
    Gu, Tianyu
    ARTIFICIAL INTELLIGENCE REVIEW, 2023, 56 (SUPPL 2) : 1897 - 1917
  • [44] An FPGA Realization of a Deep Convolutional Neural Network Using a Threshold Neuron Pruning
    Fujii, Tomoya
    Sato, Simpei
    Nakahara, Hiroki
    Motomura, Masato
    APPLIED RECONFIGURABLE COMPUTING, 2017, 10216 : 268 - 280
  • [45] Pruning Deep Neural Network Models via Minimax Concave Penalty Regression
    Liu, Xinggu
    Zhou, Lin
    Luo, Youxi
    APPLIED SCIENCES-BASEL, 2024, 14 (09):
  • [46] Explainable online ensemble of deep neural network pruning for time series forecasting
    Saadallah, Amal
    Jakobs, Matthias
    Morik, Katharina
    MACHINE LEARNING, 2022, 111 (09) : 3459 - 3487
  • [47] Deep neural network compression through interpretability-based filter pruning
    Yao, Kaixuan
    Cao, Feilong
    Leung, Yee
    Liang, Jiye
    PATTERN RECOGNITION, 2021, 119
  • [49] An efficient pruning and fine-tuning method for deep spiking neural network
    L. W. Meng
    G. C. Qiao
    X. Y. Zhang
    J. Bai
    Y. Zuo
    P. J. Zhou
    Y. Liu
    S. G. Hu
    Applied Intelligence, 2023, 53 : 28910 - 28923
  • [50] Absorption Pruning of Deep Neural Network for Object Detection in Remote Sensing Imagery
    Wang, Jielei
    Cui, Zongyong
    Zang, Zhipeng
    Meng, Xiangjie
    Cao, Zongjie
    REMOTE SENSING, 2022, 14 (24)