Enhanced gradient learning for deep neural networks

Cited by: 0
Authors
Yan, Ming [2]
Yang, Jianxi [1]
Chen, Cen [3]
Zhou, Joey Tianyi [2]
Pan, Yi [4]
Zeng, Zeng [3]
Affiliations
[1] Chongqing Jiaotong Univ, AI Res Ctr, Sch Informat Sci & Engn, Chongqing, Peoples R China
[2] Agcy Sci Technol & Res, Inst High Performance Comp, Singapore, Singapore
[3] Agcy Sci Technol & Res, Inst Infocomm Res, Singapore, Singapore
[4] Chinese Acad Sci, Shenzhen Inst Adv Technol, Beijing, Peoples R China
Keywords
Circuit connections - Deep layer - Gradient flow - Gradient learning - Images processing - Large margins - Neural-networks - Shallowest layers - Training parameters - Transport systems
DOI
10.1049/ipr2.12353
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep neural networks have achieved great success in both computer vision and natural language processing tasks. Improving gradient flow is crucial for training very deep neural networks. To address this challenge, a gradient enhancement approach is proposed that constructs short-circuit neural connections. The proposed short circuit is a unidirectional neural connection that back-propagates sensitivities, rather than gradients, from the deep layers to the shallow layers of a network. Moreover, the short circuit is formulated as a gradient truncation operation in its connecting layers, so it can be plugged into backbone models without introducing extra training parameters. Extensive experiments demonstrate that deep neural networks with the short-circuit connection improve over the baselines by a large margin on both computer vision and natural language processing tasks. The work provides a promising solution for low-resource scenarios, such as intelligent transport systems in computer vision and question answering in natural language processing.
Pages: 365-377
Number of pages: 13