Optimizing nonlinear activation function for convolutional neural networks

Cited by: 37
Authors
Varshney, Munender [1]
Singh, Pravendra [1]
Affiliations
[1] Indian Inst Technol Kanpur, Dept Comp Sci & Engn, Kanpur, Uttar Pradesh, India
Keywords
FReLU; ReLU; CNN; Convolutional neural network; Activation function
DOI
10.1007/s11760-021-01863-z
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline Classification Codes
0808; 0809
Abstract
Activation functions play a critical role in the training and performance of deep convolutional neural networks (CNNs). Currently, the rectified linear unit (ReLU) is the most commonly used activation function in deep CNNs. ReLU is a piecewise linear function that outputs the input directly if it is positive and zero otherwise. In this work, we propose a novel approach that generalizes the ReLU activation function using multiple learnable slope parameters. These slope parameters are optimized for every channel, so each channel learns a more general activation function (a variant of ReLU). This activation is named the fully parametric rectified linear unit (FReLU) and is trained with an alternating optimization technique that learns one set of parameters while keeping the other set frozen. Our experiments show that the method outperforms ReLU and its variants, and that it generalizes across tasks such as image classification, object detection, and action recognition in videos. The Top-1 classification accuracy of FReLU on ImageNet improves over ReLU by 3.75% for MobileNet and by about 2% for ResNet-50. We also provide various analyses for better interpretability of the proposed activation function.
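The abstract describes FReLU only at a high level, so the following is a minimal, hypothetical PyTorch sketch rather than the authors' implementation. It assumes that FReLU gives every channel its own learnable slope for both the positive and the negative part of the input (generalizing ReLU and PReLU), which is one way to read "multiple learnable slope parameters ... optimized for every channel"; the class name, initialization, and tensor layout are illustrative assumptions.

```python
import torch
import torch.nn as nn


class FReLUSketch(nn.Module):
    """Hypothetical per-channel parametric ReLU-style activation.

    Assumption: f(x) = a_c * x for x > 0 and b_c * x otherwise, with one
    learnable pair (a_c, b_c) per channel c. Initialized so the module
    starts out identical to plain ReLU (a_c = 1, b_c = 0).
    """

    def __init__(self, num_channels: int):
        super().__init__()
        self.pos_slope = nn.Parameter(torch.ones(num_channels))   # a_c
        self.neg_slope = nn.Parameter(torch.zeros(num_channels))  # b_c

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Expecting x of shape (N, C, H, W); reshape slopes for broadcasting.
        a = self.pos_slope.view(1, -1, 1, 1)
        b = self.neg_slope.view(1, -1, 1, 1)
        return torch.where(x > 0, a * x, b * x)
```

The alternating optimization mentioned in the abstract (learning one parameter set while the other stays frozen) could be realized with a simple phase toggle like the one below; the actual phase schedule and per-phase hyperparameters are not given in the abstract and are assumed here.

```python
def set_phase(model: nn.Module, activation_params, train_activations: bool):
    """Freeze one parameter set and unfreeze the other (assumed scheme)."""
    act_ids = {id(p) for p in activation_params}
    for p in model.parameters():
        is_act = id(p) in act_ids
        # Phase A: train only slope parameters; Phase B: train only the rest.
        p.requires_grad_(is_act if train_activations else not is_act)


# Example usage: collect the slope parameters from all FReLUSketch modules.
# act_params = [p for m in model.modules() if isinstance(m, FReLUSketch)
#               for p in m.parameters()]
# set_phase(model, act_params, train_activations=True)
```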
Pages: 1323-1330
Page count: 8
Related Papers
50 records in total
  • [1] Optimizing nonlinear activation function for convolutional neural networks
    Munender Varshney
    Pravendra Singh
    Signal, Image and Video Processing, 2021, 15 : 1323 - 1330
  • [2] Pansharpening Techniques: Optimizing the Loss Function for Convolutional Neural Networks
    Restaino, Rocco
    REMOTE SENSING, 2025, 17 (01)
  • [3] Algorithm Research on Improving Activation Function of Convolutional Neural Networks
    Guo, Yanhua
    Sun, Lei
    Zhang, Zhihong
    He, Hong
    PROCEEDINGS OF THE 2019 31ST CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2019), 2019: 3582 - 3586
  • [4] Optimizing performance of feedforward and convolutional neural networks through dynamic activation functions
    Rane, Chinmay
    Tyagi, Kanishka
    Kline, Adrienne
    Chugh, Tushar
    Manry, Michael
    EVOLUTIONARY INTELLIGENCE, 2024, 17 (5-6) : 4083 - 4093
  • [5] RSigELU: A nonlinear activation function for deep neural networks
    Kilicarslan, Serhat
    Celik, Mete
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 174 (174)
  • [6] Logish: A new nonlinear nonmonotonic activation function for convolutional neural network
    Zhu, Hegui
    Zeng, Huimin
    Liu, Jinhai
    Zhang, Xiangde
    NEUROCOMPUTING, 2021, 458 : 490 - 499
  • [7] SinP[N]: A Fast Convergence Activation Function for Convolutional Neural Networks
    Chan, Ka-Hou
    Im, Sio-Kei
    Ke, Wei
    Lei, Ngan-Lin
    2018 IEEE/ACM INTERNATIONAL CONFERENCE ON UTILITY AND CLOUD COMPUTING COMPANION (UCC COMPANION), 2018: 365 - 369
  • [8] Efficient Neural Networks on the Edge with FPGAs by Optimizing an Adaptive Activation Function
    Jiang, Yiyue
    Vaicaitis, Andrius
    Dooley, John
    Leeser, Miriam
    SENSORS, 2024, 24 (06)
  • [9] ReAFM: A Reconfigurable Nonlinear Activation Function Module for Neural Networks
    Wu, Xiao
    Liang, Shuang
    Wang, Meiqi
    Wang, Zhongfeng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2023, 70 (07) : 2660 - 2664
  • [10] An Efficient Asymmetric Nonlinear Activation Function for Deep Neural Networks
    Chai, Enhui
    Yu, Wei
    Cui, Tianxiang
    Ren, Jianfeng
    Ding, Shusheng
    SYMMETRY-BASEL, 2022, 14 (05)