Wide Hidden Expansion Layer for Deep Convolutional Neural Networks

Cited by: 0
Authors
Wang, Min [1 ]
Liu, Baoyuan [2 ]
Foroosh, Hassan [1 ]
Affiliations
[1] Univ Cent Florida, Orlando, FL 32816 USA
[2] Amazon, Seattle, WA USA
Keywords
DOI
10.1109/wacv45572.2020.9093436
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Non-linearity is an essential factor contributing to the success of deep convolutional neural networks. Increasing the non-linearity in the network enhances its learning capability and leads to better performance. We present a novel Wide Hidden Expansion (WHE) layer that can increase the number of activation functions in the network by an order of magnitude, with very little increase in computational complexity and memory consumption. It can be flexibly embedded in different network architectures to boost the performance of the original networks. The WHE layer is composed of a wide hidden layer in which each channel connects to only two input channels and one output channel. Before connecting to the output channel, each intermediate channel in the WHE layer is followed by one activation function. In this manner, the number of activation functions grows with the number of channels in the hidden layer. We apply the WHE layer to the ResNet, WideResNet, SENet, and MobileNet architectures and evaluate on the ImageNet, CIFAR-100, and Tiny ImageNet datasets. On ImageNet, models with the WHE layer achieve up to 2.01% higher Top-1 accuracy than the baseline models, with less than 4% more computation and less than 2% more parameters. On CIFAR-100 and Tiny ImageNet, applying the WHE layer to ResNet models yields consistent accuracy improvements. Applying the WHE layer to the ResNet backbone of the CenterNet object detector also boosts its performance on the COCO and Pascal VOC datasets.
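The abstract describes the WHE layer only at a high level: a wide hidden layer in which every hidden channel combines exactly two input channels, passes through its own activation function, and then feeds exactly one output channel. The PyTorch sketch below is one possible reading of that description; the class name WHELayer, the expansion factor, the random channel-pairing scheme, the ReLU activation, and the residual connection are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch of a Wide Hidden Expansion (WHE) layer, based only on the
# abstract: each hidden channel reads two input channels, passes through its own
# activation, and contributes to one output channel. Pairing scheme, expansion
# factor, ReLU, and the residual connection are assumptions, not the paper's design.
import torch
import torch.nn as nn

class WHELayer(nn.Module):
    def __init__(self, channels: int, expansion: int = 8):
        super().__init__()
        self.channels = channels
        self.hidden = channels * expansion  # width of the wide hidden layer
        # Fix, at construction time, the two source channels feeding each hidden
        # channel (a simple random sparse pattern for illustration).
        self.register_buffer("src", torch.randint(0, channels, (self.hidden, 2)))
        # Two scalar weights per hidden channel: 2 * hidden parameters in total.
        self.w = nn.Parameter(torch.randn(self.hidden, 2) * 0.1)
        # Map each hidden channel to exactly one output channel (modulo assignment).
        self.register_buffer("dst", torch.arange(self.hidden) % channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W). Gather the two parent channels of every hidden channel.
        a = x[:, self.src[:, 0]] * self.w[:, 0].view(1, -1, 1, 1)
        b = x[:, self.src[:, 1]] * self.w[:, 1].view(1, -1, 1, 1)
        h = self.act(a + b)                  # one activation per hidden channel
        # Scatter-add each hidden channel into its single output channel.
        out = torch.zeros_like(x)
        out.index_add_(1, self.dst, h)
        return out + x                       # residual connection (assumption)

Under these assumptions, a 64-channel feature map with expansion 8 passes through 512 separate activations in this layer while adding only 1,024 scalar weights, which illustrates the abstract's point that the activation count can grow with the hidden width at a small parameter and compute cost.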
Pages: 923-931
Page count: 9