Automatic Compression Ratio Allocation for Pruning Convolutional Neural Networks

Cited by: 0
Authors
Liu, Yunfeng [1]
Kong, Huihui [1]
Yu, Peihua [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Neural Networks; Network Pruning; Model Compression; Computer Vision;
DOI
10.1145/3387168.3387184
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Convolutional neural networks (CNNs) have demonstrated significant performance improvements in many application scenarios. However, their high computational complexity and large model size have limited their application on mobile and embedded devices. Various approaches have been proposed to compress CNNs. Filter pruning is widely considered a promising solution, as it can significantly speed up inference and reduce memory consumption. Most filter pruning approaches, however, allocate compression ratios manually, which relies heavily on individual expertise and is not friendly to non-professional users. In this paper, we propose an Automatic Compression Ratio Allocation (ACRA) scheme based on the binary search algorithm to prune convolutional neural networks. Specifically, ACRA provides two strategies for allocating compression ratios automatically. First, the uniform pruning strategy allocates the same compression ratio to each layer; this ratio is obtained by binary search based on the target FLOPs reduction of the whole network. Second, the sensitivity-based pruning strategy allocates an appropriate compression ratio to each layer based on the layer's sensitivity to accuracy. Experimental results on VGG-11 and VGG-16 demonstrate that our scheme can reduce FLOPs significantly while maintaining a high accuracy level. Specifically, for VGG-16 on the CIFAR-10 dataset, we reduce FLOPs by 29.18% with only a 1.24% decrease in accuracy.
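The uniform pruning strategy described in the abstract can be illustrated with a minimal sketch. The abstract does not give the FLOPs model, the filter-selection criterion, or the stopping rule, so everything below is an assumption for illustration only: the `ConvLayer` record, the `conv_flops` and `uniform_ratio` helpers, the multiply-accumulate FLOPs count, the tolerance, and the toy three-layer network are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConvLayer:
    """Hypothetical description of one convolutional layer."""
    in_channels: int
    out_channels: int
    kernel_size: int
    out_h: int
    out_w: int


def conv_flops(layers: List[ConvLayer], ratio: float) -> int:
    """Multiply-accumulates left after pruning a fraction `ratio` of the
    filters in every layer (assumed FLOPs model, biases ignored)."""
    total = 0
    prev_kept = layers[0].in_channels  # the network input is never pruned
    for layer in layers:
        # Keep at least one filter so the layer still produces output.
        kept = max(1, round(layer.out_channels * (1.0 - ratio)))
        total += (prev_kept * kept * layer.kernel_size ** 2
                  * layer.out_h * layer.out_w)
        prev_kept = kept  # pruned filters shrink the next layer's input
    return total


def uniform_ratio(layers: List[ConvLayer], target_reduction: float,
                  tol: float = 1e-3, max_iter: int = 50) -> float:
    """Binary-search one pruning ratio, shared by all layers, whose FLOPs
    reduction is as close as possible to `target_reduction`."""
    base = conv_flops(layers, 0.0)
    lo, hi = 0.0, 1.0
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        reduction = 1.0 - conv_flops(layers, mid) / base
        if abs(reduction - target_reduction) <= tol:
            return mid
        if reduction < target_reduction:
            lo = mid  # not enough FLOPs removed: prune more
        else:
            hi = mid  # too many FLOPs removed: prune less
    return (lo + hi) / 2.0


# Toy three-layer network (hypothetical shapes, not VGG).
layers = [
    ConvLayer(3, 64, 3, 32, 32),
    ConvLayer(64, 128, 3, 16, 16),
    ConvLayer(128, 256, 3, 8, 8),
]
ratio = uniform_ratio(layers, target_reduction=0.30)
achieved = 1.0 - conv_flops(layers, ratio) / conv_flops(layers, 0.0)
print(f"uniform ratio: {ratio:.4f}, achieved FLOPs reduction: {achieved:.4f}")
```

Binary search applies here because, under this FLOPs model, the reduction never decreases as the shared ratio grows. Since filter counts are integers, the achieved reduction moves in small steps and may not match the target exactly.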
Pages: 6
Related papers
50 in total
  • [1] Sparseness Ratio Allocation and Neuron Re-pruning for Neural Networks Compression
    Guo, Li
    Zhou, Dajiang
    Zhou, Jinjia
    Kimura, Shinji
    2018 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2018
  • [2] Compression of Deep Convolutional Neural Networks Using Effective Channel Pruning
    Guo, Qingbei
    Wu, Xiao-Jun
    Zhao, Xiuyang
    IMAGE AND GRAPHICS, ICIG 2019, PT I, 2019, 11901 : 760 - 772
  • [3] Pruning Ratio Optimization with Layer-Wise Pruning Method for Accelerating Convolutional Neural Networks
    Kamma, Koji
    Inoue, Sarimu
    Wada, Toshikazu
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2022, E105D (01) : 161 - 169
  • [4] Automatic Pruning for Quantized Neural Networks
    Guerra, Luis
    Drummond, Tom
    2021 INTERNATIONAL CONFERENCE ON DIGITAL IMAGE COMPUTING: TECHNIQUES AND APPLICATIONS (DICTA 2021), 2021 : 290 - 297
  • [5] Variational Automatic Channel Pruning Algorithm Based on Structure Optimization for Convolutional Neural Networks
    Han, Shuo
    Zhan, Yufei
    Liu, Xingang
    JOURNAL OF INTERNET TECHNOLOGY, 2021, 22 (02) : 339 - 351
  • [6] Iterative clustering pruning for convolutional neural networks
    Chang, Jingfei
    Lu, Yang
    Xue, Ping
    Xu, Yiqun
    Wei, Zhen
    KNOWLEDGE-BASED SYSTEMS, 2023, 265
  • [7] Leveraging Structured Pruning of Convolutional Neural Networks
    Tessier, Hugo
    Gripon, Vincent
    Leonardon, Mathieu
    Arzel, Matthieu
    Bertrand, David
    Hannagan, Thomas
    2022 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS), 2022 : 174 - 179
  • [8] Flattening Layer Pruning in Convolutional Neural Networks
    Jeczmionek, Ernest
    Kowalski, Piotr A.
    SYMMETRY-BASEL, 2021, 13 (07)
  • [9] Structured Pruning of Deep Convolutional Neural Networks
    Anwar, Sajid
    Hwang, Kyuyeon
    Sung, Wonyong
    ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2017, 13 (03)
  • [10] Activation Pruning of Deep Convolutional Neural Networks
    Ardakani, Arash
    Condo, Carlo
    Gross, Warren J.
    2017 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2017), 2017 : 1325 - 1329