Enhanced gradient learning for deep neural networks

Authors
Yan, Ming [2 ]
Yang, Jianxi [1 ]
Chen, Cen [3 ]
Zhou, Joey Tianyi [2 ]
Pan, Yi [4 ]
Zeng, Zeng [3 ]
Affiliations
[1] Chongqing Jiaotong Univ, AI Res Ctr, Sch Informat Sci & Engn, Chongqing, Peoples R China
[2] Agcy Sci Technol & Res, Inst High Performance Comp, Singapore, Singapore
[3] Agcy Sci Technol & Res, Inst Infocomm Res, Singapore, Singapore
[4] Chinese Acad Sci, Shenzhen Inst Adv Technol, Beijing, Peoples R China
Keywords
Circuit connections - Deep layer - Gradient flow - Gradient learning - Images processing - Large margins - Neural-networks - Shallowest layers - Training parameters - Transport systems;
DOI
10.1049/ipr2.12353
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep neural networks have achieved great success in both computer vision and natural language processing tasks. Improving gradient flow is crucial for training very deep neural networks. To address this challenge, a gradient enhancement approach is proposed that constructs short circuit neural connections. The proposed short circuit is a unidirectional neural connection that back-propagates sensitivities, rather than gradients, from the deep layers to the shallow layers. Moreover, the short circuit is formulated as a gradient truncation operation in the layers it connects, so it can be plugged into backbone models without introducing extra training parameters. Extensive experiments demonstrate that deep neural networks with short circuit connections improve over the baselines by a large margin on both computer vision and natural language processing tasks. The work provides a promising solution for low-resource scenarios, such as intelligent transport systems in computer vision and question answering in natural language processing.
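The abstract distinguishes sensitivities from gradients. In standard backpropagation terminology, the sensitivity of a layer is the derivative of the loss with respect to that layer's pre-activation, and the weight gradient is the sensitivity multiplied by the layer's input. The sketch below only illustrates that standard distinction on a toy chain of one-unit tanh layers, and shows how sensitivities shrink toward the shallow layers. It is not the paper's short circuit connection, whose routing is not described in this record; the network, the function names, and the values used are all assumptions made for illustration.

```python
import math


def forward(weights, x):
    """Run a chain of one-unit tanh layers (a hypothetical toy network).

    Returns the pre-activations z_l and the activations a_l, with a_0 = x.
    """
    activations = [x]
    pre_activations = []
    for w in weights:
        z = w * activations[-1]          # z_l = w_l * a_{l-1}
        pre_activations.append(z)
        activations.append(math.tanh(z))  # a_l = tanh(z_l)
    return pre_activations, activations


def loss(weights, x, target):
    """Squared-error loss on the output of the last layer."""
    _, activations = forward(weights, x)
    return 0.5 * (activations[-1] - target) ** 2


def backward(weights, x, target):
    """Standard backpropagation.

    sensitivity  delta_l = dL/dz_l
    gradient     dL/dw_l = delta_l * a_{l-1}
    """
    zs, acts = forward(weights, x)
    n = len(weights)
    deltas = [0.0] * n
    # Output layer: dL/da_n times tanh'(z_n).
    deltas[-1] = (acts[-1] - target) * (1.0 - math.tanh(zs[-1]) ** 2)
    # Propagate sensitivities from deep layers back to shallow layers.
    for l in range(n - 2, -1, -1):
        deltas[l] = deltas[l + 1] * weights[l + 1] * (1.0 - math.tanh(zs[l]) ** 2)
    # Each weight gradient is the layer's sensitivity times its input.
    grads = [deltas[l] * acts[l] for l in range(n)]
    return deltas, grads


if __name__ == "__main__":
    weights = [0.5] * 8  # arbitrary illustrative values
    deltas, grads = backward(weights, x=1.0, target=1.0)
    for l, (d, g) in enumerate(zip(deltas, grads), start=1):
        print(f"layer {l}: sensitivity={d:.3e}  gradient={g:.3e}")
```

Running the script prints one sensitivity and one gradient per layer; with these weights the magnitudes decay from the deepest layer to the shallowest, which is the weak gradient flow that the abstract says the short circuit is meant to improve.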
Pages: 365-377
Number of pages: 13