Wide Hidden Expansion Layer for Deep Convolutional Neural Networks

Cited by: 0
Authors
Wang, Min [1 ]
Liu, Baoyuan [2 ]
Foroosh, Hassan [1 ]
Affiliations
[1] Univ Cent Florida, Orlando, FL 32816 USA
[2] Amazon, Seattle, WA USA
Keywords
DOI
10.1109/wacv45572.2020.9093436
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Non-linearity is an essential factor contributing to the success of deep convolutional neural networks. Increasing the non-linearity in the network enhances its learning capability and leads to better performance. We present a novel Wide Hidden Expansion (WHE) layer that can increase the number of activation functions in the network by an order of magnitude, with very little increase in computational complexity and memory consumption. It can be flexibly embedded in different network architectures to boost the performance of the original networks. The WHE layer is composed of a wide hidden layer in which each channel connects to only two input channels and one output channel. Before connecting to the output channel, each intermediate channel in the WHE layer is followed by one activation function. In this manner, the number of activation functions grows with the number of channels in the hidden layer. We apply the WHE layer to the ResNet, WideResNet, SENet, and MobileNet architectures and evaluate on the ImageNet, CIFAR-100, and Tiny ImageNet datasets. On ImageNet, models with the WHE layer achieve up to 2.01% higher Top-1 accuracy than the baseline models, with less than 4% more computation and less than 2% more parameters. On CIFAR-100 and Tiny ImageNet, applying the WHE layer to ResNet models yields consistent accuracy improvements. Applying the WHE layer to the ResNet backbone of the CenterNet object detector also boosts its performance on the COCO and Pascal VOC datasets.
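The abstract describes the WHE layer only at a high level: a wide hidden layer in which every hidden channel combines exactly two input channels, passes through its own activation function, and then feeds exactly one output channel. The PyTorch sketch below is one possible reading of that description; the class name WHELayer, the expansion factor, the random channel-pairing scheme, the ReLU activation, and the residual connection are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch of a Wide Hidden Expansion (WHE) layer, based only on the
# abstract: each hidden channel reads two input channels, passes through its own
# activation, and contributes to one output channel. Pairing scheme, expansion
# factor, ReLU, and the residual connection are assumptions, not the paper's design.
import torch
import torch.nn as nn

class WHELayer(nn.Module):
    def __init__(self, channels: int, expansion: int = 8):
        super().__init__()
        self.channels = channels
        self.hidden = channels * expansion  # width of the wide hidden layer
        # Fix, at construction time, the two source channels feeding each hidden
        # channel (a simple random sparse pattern for illustration).
        self.register_buffer("src", torch.randint(0, channels, (self.hidden, 2)))
        # Two scalar weights per hidden channel: 2 * hidden parameters in total.
        self.w = nn.Parameter(torch.randn(self.hidden, 2) * 0.1)
        # Map each hidden channel to exactly one output channel (modulo assignment).
        self.register_buffer("dst", torch.arange(self.hidden) % channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W). Gather the two parent channels of every hidden channel.
        a = x[:, self.src[:, 0]] * self.w[:, 0].view(1, -1, 1, 1)
        b = x[:, self.src[:, 1]] * self.w[:, 1].view(1, -1, 1, 1)
        h = self.act(a + b)                  # one activation per hidden channel
        # Scatter-add each hidden channel into its single output channel.
        out = torch.zeros_like(x)
        out.index_add_(1, self.dst, h)
        return out + x                       # residual connection (assumption)

Under these assumptions, a 64-channel feature map with expansion 8 passes through 512 separate activations in this layer while adding only 1,024 scalar weights, which illustrates the abstract's point that the activation count can grow with the hidden width at a small parameter and compute cost.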
Pages: 923-931
Page count: 9