Deep limits of residual neural networks

Cited by: 8
Authors
Thorpe, Matthew [1 ,2 ]
van Gennip, Yves [3 ]
Affiliations
[1] Univ Manchester, Dept Math, Manchester M13 9PL, England
[2] Alan Turing Inst, London NW1 2DB, England
[3] Delft Univ Technol, Delft Inst Appl Math, NL-2628 CD Delft, Netherlands
Funding
European Research Council
Keywords
Deep neural networks; Ordinary differential equations; Deep layer limits; Variational convergence; Gamma-convergence; Regularity
DOI
10.1007/s40687-022-00370-y
Chinese Library Classification
O1 [Mathematics]
Subject Classification Codes
0701; 070101
Abstract
Neural networks have been very successful in many applications; we often, however, lack a theoretical understanding of what the neural networks are actually learning. This becomes a problem when trying to generalise to new data sets. The contribution of this paper is to show that, for the residual neural network model, the deep layer limit coincides with a parameter estimation problem for a nonlinear ordinary differential equation. In particular, whilst it is known that the residual neural network model is a discretisation of an ordinary differential equation, we show convergence in a variational sense. This implies that optimal parameters converge in the deep layer limit, which is a stronger statement than saying that, for a fixed parameter, the residual neural network model converges (the latter does not in general imply the former). Our variational analysis provides a discrete-to-continuum Γ-convergence result for the objective function of the residual neural network training step to a variational problem constrained by a system of ordinary differential equations; this rigorously connects the discrete setting to a continuum problem.
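The record itself contains no displayed formulas; the following is a minimal sketch, in LaTeX notation, of the discretisation the abstract refers to. The symbols X_n, K_n, b_n, \sigma and the objectives E_N, E_\infty are assumed here for illustration and are not taken from the paper. With N layers and step size 1/N, a residual block updates the state by

    X_{n+1} = X_n + \frac{1}{N}\,\sigma\!\left(K_n X_n + b_n\right), \qquad n = 0, \dots, N-1,

which is the explicit (forward) Euler scheme for the ordinary differential equation

    \dot{X}(t) = \sigma\!\left(K(t)\,X(t) + b(t)\right), \qquad t \in [0, 1].

The variational statement is stronger than pointwise convergence of this scheme: if E_N denotes the training objective with N layers and E_\infty the continuum objective constrained by the ODE above, then Γ-convergence of E_N to E_\infty, combined with a compactness property, implies that minimisers of E_N converge (along subsequences) to minimisers of E_\infty as N \to \infty.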
Pages: 44
Related Papers
50 records in total
• [31] Residual SDE-Net for uncertainty estimates of deep neural networks. Wang Y.; Yao S.; Tan H. Beijing Hangkong Hangtian Daxue Xuebao/Journal of Beijing University of Aeronautics and Astronautics, 2023, 49(08): 1991–2000.
• [32] Residual Distillation: Towards Portable Deep Neural Networks without Shortcuts. Li, Guilin; Zhang, Junlei; Wang, Yunhe; Liu, Chuanjian; Tan, Matthias; Lin, Yunfeng; Zhang, Wei; Feng, Jiashi; Zhang, Tong. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020, 33.
• [33] Automatic Concurrent Arrhythmia Classification Using Deep Residual Neural Networks. Nankani, Deepankar; Saikia, Pallabi; Baruah, Rashmi Dutta. 2020 Computing in Cardiology, 2020.
• [34] A Neuronal Morphology Classification Approach Based on Deep Residual Neural Networks. Lin, Xianghong; Zheng, Jianyang; Wang, Xiangwen; Ma, Huifang. Neural Information Processing (ICONIP 2018), Pt IV, 2018, 11304: 336–348.
• [35] Learning deep hierarchical and temporal recurrent neural networks with residual learning. Zia, Tehseen; Abbas, Assad; Habib, Usman; Khan, Muhammad Sajid. International Journal of Machine Learning and Cybernetics, 2020, 11(04): 873–882.
• [36] Deep Residual Neural Networks for Image in Audio Steganography (Workshop Paper). Agarwal, Shivam; Venkatraman, Siddarth. 2020 IEEE Sixth International Conference on Multimedia Big Data (BigMM 2020), 2020: 430–434.
• [37] Wide deep residual networks in networks. Alaeddine, Hmidi; Jihene, Malek. Multimedia Tools and Applications, 2023, 82(05): 7889–7899.
• [39] The Limits of SEMA on Distinguishing Similar Activation Functions of Embedded Deep Neural Networks. Takatoi, Go; Sugawara, Takeshi; Sakiyama, Kazuo; Hara-Azumi, Yuko; Li, Yang. Applied Sciences-Basel, 2022, 12(09).
• [40] Distributed Training of Deep Neural Networks: Theoretical and Practical Limits of Parallel Scalability. Keuper, Janis; Pfreundt, Franz-Josef. Proceedings of 2016 2nd Workshop on Machine Learning in HPC Environments (MLHPC), 2016: 19–26.