Training Very Deep Networks

Cited by: 0
Authors
Srivastava, Rupesh Kumar [1 ]
Greff, Klaus [1 ]
Schmidhuber, Juergen [1 ]
Affiliation
[1] USI, SUPSI, Swiss AI Lab IDSIA, Lugano, Switzerland
DOI: not available
CLC Number: TP18 [Artificial Intelligence Theory]
Subject Classification Codes: 081104; 0812; 0835; 1405
Abstract
Theoretical and empirical evidence indicates that the depth of neural networks is crucial for their success. However, training becomes more difficult as depth increases, and training of very deep networks remains an open problem. Here we introduce a new architecture designed to overcome this. Our so-called highway networks allow unimpeded information flow across many layers on information highways. They are inspired by Long Short-Term Memory recurrent networks and use adaptive gating units to regulate the information flow. Even with hundreds of layers, highway networks can be trained directly through simple gradient descent. This enables the study of extremely deep and efficient architectures.
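Concretely, each highway layer computes y = H(x, W_H) * T(x, W_T) + x * (1 - T(x, W_T)), where H is an ordinary nonlinear transform, T is a sigmoid "transform gate", and the products are elementwise: the gate decides, per unit, how much of the transformed signal to pass and how much of the input to carry through unchanged. Below is a minimal PyTorch sketch of such a layer, not the authors' reference implementation; the ReLU nonlinearity, the 64-unit width, the 100-layer depth, and the -2.0 gate bias are illustrative assumptions (the paper recommends initializing the gate bias to a negative value so that layers start close to the identity map).

```python
import torch
import torch.nn as nn

class HighwayLayer(nn.Module):
    """One highway layer: y = H(x) * T(x) + x * (1 - T(x))."""

    def __init__(self, dim: int, gate_bias: float = -2.0):
        super().__init__()
        self.transform = nn.Linear(dim, dim)  # H(x, W_H)
        self.gate = nn.Linear(dim, dim)       # T(x, W_T)
        # A negative initial gate bias keeps sigmoid(gate(x)) small,
        # so the layer starts near the identity (carry) behavior.
        nn.init.constant_(self.gate.bias, gate_bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.transform(x))   # candidate transform
        t = torch.sigmoid(self.gate(x))     # transform gate in (0, 1)
        return h * t + x * (1.0 - t)        # gated mix of transform and carry

# Even a 100-layer stack is trainable with plain gradient descent,
# because gradients can flow through the carry path of every layer.
model = nn.Sequential(*[HighwayLayer(64) for _ in range(100)])
y = model(torch.randn(8, 64))
```

Because the carry path is the identity whenever t is near zero, the stack behaves like a shallow network at initialization and only deepens, unit by unit, as the gates learn to open.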
Pages: 9
Related Papers (50 in total)
  • [1] Zhang, Ziming; Chen, Yuting; Saligrama, Venkatesh. Efficient Training of Very Deep Neural Networks for Supervised Hashing. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 1487-1495.
  • [2] Chi, Zhizhen; Li, Hongyang; Wang, Jingjing; Lu, Huchuan. On the Importance of Network Architecture in Training Very Deep Neural Networks. 2016 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC), 2016.
  • [3] Oyedotun, Oyebade K.; Shabayek, Abd El Rahman; Aouada, Djamila; Ottersten, Bjorn. Improved Highway Network Block for Training Very Deep Neural Networks. IEEE Access, 2020, 8(08): 176758-176773.
  • [4] Oyedotun, Oyebade K.; Al Ismaeil, Kassem; Aouada, Djamila. Training Very Deep Neural Networks: Rethinking the Role of Skip Connections. Neurocomputing, 2021, 441: 105-117.
  • [5] Oyedotun, Oyebade K.; Shabayek, Abd El Rahman; Aouada, Djamila; Ottersten, Bjorn. Revisiting the Training of Very Deep Neural Networks without Skip Connections. 2020 25th International Conference on Pattern Recognition (ICPR), 2021: 2724-2731.
  • [6] Oyedotun, Oyebade K.; Shabayek, Abd El Rahman; Aouada, Djamila; Ottersten, Bjorn. Highway Network Block with Gates Constraints for Training Very Deep Networks. Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018: 1739-1748.
  • [7] Pham, Trang; Tran, Truyen; Phung, Dinh; Venkatesh, Svetha. Faster Training of Very Deep Networks via p-Norm Gates. 2016 23rd International Conference on Pattern Recognition (ICPR), 2016: 3542-3547.
  • [8] Ranjan, Shivesh; Yu, Chengzhu; Zhang, Chunlei; Kelly, Finnian; Hansen, John H. L. Language Recognition Using Deep Neural Networks with Very Limited Training Data. 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016: 5830-5834.
  • [9] Oyedotun, Oyebade K.; Shabayek, Abd El Rahman; Aouada, Djamila; Ottersten, Bjorn. Training Very Deep Networks via Residual Learning with Stochastic Input Shortcut Connections. Neural Information Processing (ICONIP 2017), Part II, 2017, 10635: 23-33.
  • [10] Vincent, Pascal; de Brebisson, Alexandre; Bouthillier, Xavier. Efficient Exact Gradient Update for Training Deep Networks with Very Large Sparse Targets. Advances in Neural Information Processing Systems 28 (NIPS 2015), 2015, 28.