Deep limits of residual neural networks

Cited by: 8
Authors
Thorpe, Matthew [1 ,2 ]
van Gennip, Yves [3 ]
Affiliations
[1] Univ Manchester, Dept Math, Manchester M13 9PL, England
[2] Alan Turing Inst, London NW1 2DB, England
[3] Delft Univ Technol, Delft Inst Appl Math, NL-2628 CD Delft, Netherlands
Funding
European Research Council;
Keywords
Deep neural networks; Ordinary differential equations; Deep layer limits; Variational convergence; Gamma-convergence; Regularity;
DOI
10.1007/s40687-022-00370-y
CLC classification
O1 [Mathematics];
Subject classification codes
0701; 070101;
Abstract
Neural networks have been very successful in many applications; we often, however, lack a theoretical understanding of what the neural networks are actually learning. This problem emerges when trying to generalise to new data sets. The contribution of this paper is to show that, for the residual neural network model, the deep layer limit coincides with a parameter estimation problem for a nonlinear ordinary differential equation. In particular, whilst it is known that the residual neural network model is a discretisation of an ordinary differential equation, we show convergence in a variational sense. This implies that optimal parameters converge in the deep layer limit. This is a stronger statement than saying that, for a fixed parameter, the residual neural network model converges (the latter does not in general imply the former). Our variational analysis provides a discrete-to-continuum Γ-convergence result for the objective function of the residual neural network training step to a variational problem constrained by a system of ordinary differential equations; this rigorously connects the discrete setting to a continuum problem.
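As a rough illustration of the discretisation the abstract refers to (a sketch under the standard ResNet-ODE identification; the step size $h = 1/N$ and the generic layer map $f$ are assumptions for exposition, not notation taken from the paper): a residual block updates a hidden state by
$$x_{n+1} = x_n + h\, f(x_n, \theta_n), \qquad h = \tfrac{1}{N}, \quad n = 0, \dots, N-1,$$
which is the explicit (forward) Euler scheme for the ordinary differential equation
$$\dot{x}(t) = f\big(x(t), \theta(t)\big), \qquad t \in [0, 1].$$
The deep layer limit $N \to \infty$ then corresponds to passing from the discrete recursion to the continuous-time dynamics; the paper's Γ-convergence result concerns the training objectives built on these two descriptions, so that minimisers (optimal parameters) of the discrete problem converge to minimisers of the ODE-constrained variational problem, rather than merely the forward maps converging for a fixed parameter.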
Pages: 44
Related papers
50 records in total
  • [1] Deep limits of residual neural networks (vol 10, 6, 2023)
    Thorpe, Matthew
    van Gennip, Yves
    RESEARCH IN THE MATHEMATICAL SCIENCES, 2024, 11 (02)
  • [2] Deep Residual Learning in Spiking Neural Networks
    Fang, Wei
    Yu, Zhaofei
    Chen, Yanqi
    Huang, Tiejun
    Masquelier, Timothee
    Tian, Yonghong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [3] Aggregated Residual Transformations for Deep Neural Networks
    Xie, Saining
    Girshick, Ross
    Dollar, Piotr
    Tu, Zhuowen
    He, Kaiming
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 5987 - 5995
  • [4] Deep Residual Neural Networks for Audio Spoofing Detection
    Alzantot, Moustafa
    Wang, Ziqi
    Srivastava, Mani B.
    INTERSPEECH 2019, 2019, : 1078 - 1082
  • [5] Deep Residual and Classified Neural Networks for Inverse Halftoning
    Guo, Jing-Ming
    Sankarasrinivasan, S.
    Le Viet Hung
    Liu, Wei
    2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 2053 - 2060
  • [6] Reversible Architectures for Arbitrarily Deep Residual Neural Networks
    Chang, Bo
    Meng, Lili
    Haber, Eldad
    Ruthotto, Lars
    Begert, David
    Holtham, Elliot
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 2811 - 2818
  • [7] ARRHYTHMIA CLASSIFICATION USING DEEP RESIDUAL NEURAL NETWORKS
    Shi, Zhenghao
    Yin, Zhiyan
    Ren, Xiaoyong
    Liu, Haiqin
    Chen, Jingguo
    Hei, Xinhong
    Luo, Jing
    You, Zhenzhen
    Zhao, Minghua
    JOURNAL OF MECHANICS IN MEDICINE AND BIOLOGY, 2021, 21 (10)
  • [8] Continuous limits of residual neural networks in case of large input data
    Herty, Michael
    Thuenen, Anna
    Trimborn, Torsten
    Visconti, Giuseppe
    COMMUNICATIONS IN APPLIED AND INDUSTRIAL MATHEMATICS, 2022, 13 (01) : 96 - 120
  • [9] Deep Limits and a Cut-Off Phenomenon for Neural Networks
    Avelin, Benny
    Karlsson, Anders
    JOURNAL OF MACHINE LEARNING RESEARCH, 2022, 23
  • [10] GRAPH EXPANSIONS OF DEEP NEURAL NETWORKS AND THEIR UNIVERSAL SCALING LIMITS
    Cirone, Nicola Muça
    Hamdan, Jad
    Salvi, Cristopher
    arXiv