Automatic Compression Ratio Allocation for Pruning Convolutional Neural Networks

Cited by: 0

Authors:

Liu, Yunfeng [1]

Kong, Huihui [1]

Yu, Peihua [1]

Affiliations:

[1] Beijing Univ Posts & Telecommun, Beijing, Peoples R China

Source:

ICVISP 2019: PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON VISION, IMAGE AND SIGNAL PROCESSING | 2019

Funding:

National Natural Science Foundation of China;

Keywords:

Neural Networks; Network Pruning; Model Compression; Computer Vision;

DOI:

10.1145/3387168.3387184

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Convolutional neural networks (CNNs) have demonstrated significant performance improvements in many application scenarios. However, their high computational complexity and large model size have limited their deployment on mobile and embedded devices. Various approaches have been proposed to compress CNNs. Filter pruning is widely considered a promising solution because it can significantly speed up inference and reduce memory consumption. Most filter pruning approaches, however, allocate the compression ratio manually, which relies heavily on individual expertise and is unfriendly to non-professional users. In this paper, we propose an Automatic Compression Ratio Allocation (ACRA) scheme, based on the binary search algorithm, for pruning convolutional neural networks. Specifically, ACRA provides two strategies for allocating the compression ratio automatically. First, the uniform pruning strategy allocates the same compression ratio to every layer; this ratio is obtained by binary search from the target FLOPs reduction of the whole network. Second, the sensitivity-based pruning strategy allocates an appropriate compression ratio to each layer according to that layer's sensitivity to accuracy. Experimental results on VGG-11 and VGG-16 demonstrate that our scheme can reduce FLOPs significantly while maintaining a high level of accuracy. Specifically, for VGG-16 on the CIFAR-10 dataset, we reduce FLOPs by 29.18% with an accuracy decrease of only 1.24%.
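
The uniform strategy described in the abstract can be illustrated with a short Python sketch. It is not the authors' implementation: the `ConvLayer` record, the multiply-accumulate cost model, the rule that pruning a layer's filters also shrinks the next layer's input channels, and the names `conv_flops` and `uniform_ratio` are all assumptions made for illustration. The sketch treats the network as a plain chain of convolutions and ignores fully connected layers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConvLayer:
    """Hypothetical description of one convolution in a sequential network."""
    in_channels: int
    out_channels: int
    kernel_size: int
    out_h: int
    out_w: int


def conv_flops(layers: List[ConvLayer], ratio: float) -> int:
    """Multiply-accumulate count when a fraction `ratio` of filters is removed per layer."""
    total = 0
    in_ch = layers[0].in_channels  # the network input itself is never pruned
    for layer in layers:
        # Keep at least one filter so the layer does not disappear.
        kept = max(1, round(layer.out_channels * (1.0 - ratio)))
        total += in_ch * kept * layer.kernel_size ** 2 * layer.out_h * layer.out_w
        in_ch = kept  # pruned filters also shrink the next layer's input
    return total


def uniform_ratio(layers: List[ConvLayer], target_reduction: float,
                  tol: float = 1e-3, max_iter: int = 50) -> float:
    """Binary-search one shared pruning ratio that reaches `target_reduction` of FLOPs."""
    base = conv_flops(layers, 0.0)

    def reduction(ratio: float) -> float:
        return 1.0 - conv_flops(layers, ratio) / base

    lo, hi = 0.0, 1.0
    if reduction(hi) < target_reduction:
        raise ValueError("target FLOPs reduction is not reachable")
    # Invariant: `hi` always meets the target; `lo` is 0 or a ratio that falls short.
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = (lo + hi) / 2.0
        if reduction(mid) < target_reduction:
            lo = mid
        else:
            hi = mid
    return hi
```

Binary search applies here because, under this cost model, the FLOPs reduction never decreases as the shared ratio grows. Calling `uniform_ratio(layers, 0.30)` on a list of layer descriptions returns a ratio whose predicted reduction is at least 30%. The accuracy after pruning still has to be measured separately on the real network.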


Pages: 6

50 records in total

[1] Sparseness Ratio Allocation and Neuron Re-pruning for Neural Networks Compression
Guo, Li
Zhou, Dajiang
Zhou, Jinjia
Kimura, Shinji
2018 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2018
[2] Compression of Deep Convolutional Neural Networks Using Effective Channel Pruning
Guo, Qingbei
Wu, Xiao-Jun
Zhao, Xiuyang
IMAGE AND GRAPHICS, ICIG 2019, PT I, 2019, 11901 : 760 - 772
[3] Pruning Ratio Optimization with Layer-Wise Pruning Method for Accelerating Convolutional Neural Networks
Kamma, Koji
Inoue, Sarimu
Wada, Toshikazu
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2022, E105D (01) : 161 - 169
[4] Automatic Pruning for Quantized Neural Networks
Guerra, Luis
Drummond, Tom
2021 INTERNATIONAL CONFERENCE ON DIGITAL IMAGE COMPUTING: TECHNIQUES AND APPLICATIONS (DICTA 2021), 2021 : 290 - 297
[5] Variational Automatic Channel Pruning Algorithm Based on Structure Optimization for Convolutional Neural Networks
Han, Shuo
Zhan, Yufei
Liu, Xingang
JOURNAL OF INTERNET TECHNOLOGY, 2021, 22 (02) : 339 - 351
[6] Iterative clustering pruning for convolutional neural networks
Chang, Jingfei
Lu, Yang
Xue, Ping
Xu, Yiqun
Wei, Zhen
KNOWLEDGE-BASED SYSTEMS, 2023, 265
[7] Leveraging Structured Pruning of Convolutional Neural Networks
Tessier, Hugo
Gripon, Vincent
Leonardon, Mathieu
Arzel, Matthieu
Bertrand, David
Hannagan, Thomas
2022 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS), 2022 : 174 - 179
[8] Flattening Layer Pruning in Convolutional Neural Networks
Jeczmionek, Ernest
Kowalski, Piotr A.
SYMMETRY-BASEL, 2021, 13 (07)
[9] Structured Pruning of Deep Convolutional Neural Networks
Anwar, Sajid
Hwang, Kyuyeon
Sung, Wonyong
ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2017, 13 (03)
[10] Activation Pruning of Deep Convolutional Neural Networks
Ardakani, Arash
Condo, Carlo
Gross, Warren J.
2017 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2017), 2017 : 1325 - 1329
