Automated Tensor Decomposition to Accelerate Convolutional Neural Networks

Cited by: 0
Authors
Song B.-B. [1]
Zhang H. [2,3]
Wu Z.-F. [2,3]
Liu J.-H. [2,3]
Liang Y. [2,3]
Zhou W. [2,3]
Affiliations
[1] School of Information Science and Engineering, Yunnan University, Kunming
[2] National Pilot School of Software, Yunnan University, Kunming
[3] Engineering Research Center of Cyberspace, Yunnan University, Kunming
Source
Ruan Jian Xue Bao/Journal of Software | 2021, Vol. 32, No. 11
Funding
National Natural Science Foundation of China
Keywords
Automatic machine learning; Convolutional neural network; Neural network acceleration; Neural network compression; Tensor decomposition;
DOI
10.13328/j.cnki.jos.006057
Abstract
Recently, convolutional neural networks (CNNs) have demonstrated strong performance and are widely used in many fields. Because CNNs have large numbers of parameters and demand substantial storage and computing power, they are difficult to deploy on resource-constrained devices, so compressing and accelerating CNN models has become an urgent problem. With the research and development of automated machine learning (AutoML), AutoML has profoundly influenced the development of neural networks. Inspired by this, this study proposes two automated CNN acceleration algorithms, based on parameter estimation and on genetic algorithms, which compute the optimal accelerated CNN model within a given accuracy-loss bound. This effectively eliminates the error introduced when the ranks for tensor decomposition are selected by hand, and improves the compression and acceleration of convolutional neural networks. In rigorous tests on the MNIST and CIFAR-10 datasets, accuracy on MNIST decreases only slightly, by 0.35%, compared with the original network, while the model's running time drops sharply, by 4.1 times; accuracy on CIFAR-10 decreases by 5.13%, while running time drops by 0.8 times. © Copyright 2021, Institute of Software, the Chinese Academy of Sciences. All rights reserved.
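The record above only summarizes the method; the paper's own algorithms (parameter estimation and a genetic algorithm for rank selection) are not reproduced here. As a rough illustration of the rank/accuracy trade-off that those algorithms automate, the following NumPy sketch applies a Tucker-2 (channel-mode) decomposition, one common tensor decomposition for convolutional kernels, and reports the reconstruction error at several candidate ranks. All function names are illustrative assumptions, not the authors' code.

```python
import numpy as np

def mode_unfold(tensor, mode):
    """Unfold a tensor along `mode` into a (dim_mode x rest) matrix."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def truncated_svd_basis(matrix, rank):
    """Leading `rank` left singular vectors of `matrix`."""
    u, _, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank]

def tucker2_decompose_conv(weight, rank_out, rank_in):
    """Tucker-2 HOSVD of a conv kernel (T, S, d, d) along the channel modes.

    Returns (u_out, core, u_in) with weight ~ core x_0 u_out x_1 u_in,
    which maps one d x d convolution onto three cheaper ones:
      1x1 (S -> rank_in), d x d (rank_in -> rank_out), 1x1 (rank_out -> T).
    """
    u_out = truncated_svd_basis(mode_unfold(weight, 0), rank_out)  # (T, R1)
    u_in = truncated_svd_basis(mode_unfold(weight, 1), rank_in)    # (S, R2)
    # Project the kernel onto the two channel bases to obtain the core.
    core = np.einsum('tskl,tp,sq->pqkl', weight, u_out, u_in)      # (R1, R2, d, d)
    return u_out, core, u_in

def reconstruction_error(weight, u_out, core, u_in):
    """Relative Frobenius error of the low-rank approximation."""
    approx = np.einsum('pqkl,tp,sq->tskl', core, u_out, u_in)
    return np.linalg.norm(weight - approx) / np.linalg.norm(weight)

if __name__ == '__main__':
    rng = np.random.default_rng(0)
    w = rng.standard_normal((64, 32, 3, 3))   # (out_channels, in_channels, k, k)
    for r1, r2 in [(64, 32), (32, 16), (16, 8)]:
        err = reconstruction_error(w, *tucker2_decompose_conv(w, r1, r2))
        print(f'ranks ({r1:2d},{r2:2d}) -> relative error {err:.3f}')
```

Smaller ranks (r1, r2) shrink the three resulting convolutions and hence the running time, at the cost of a larger reconstruction error; the paper's contribution, per the abstract, is choosing such ranks automatically within a given accuracy-loss bound rather than by hand.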
Pages: 3468-3481
Page count: 13