Deep limits of residual neural networks

Cited by: 8

Authors:

Thorpe, Matthew [1,2]

van Gennip, Yves [3]

Affiliations:

[1] Univ Manchester, Dept Math, Manchester M13 9PL, England

[2] Alan Turing Inst, London NW1 2DB, England

[3] Delft Univ Technol, Delft Inst Appl Math, NL-2628 CD Delft, Netherlands

Source:

RESEARCH IN THE MATHEMATICAL SCIENCES | 2023, Vol. 10, No. 1

Funding:

European Research Council;

Keywords:

Deep neural networks; Ordinary differential equations; Deep layer limits; Variational convergence; Gamma-convergence; Regularity;

DOI:

10.1007/s40687-022-00370-y

Chinese Library Classification:

O1 [Mathematics];

Discipline classification codes:

0701 ; 070101 ;

Abstract:

Neural networks have been very successful in many applications; we often, however, lack a theoretical understanding of what the neural networks are actually learning. This problem emerges when trying to generalise to new data sets. The contribution of this paper is to show that, for the residual neural network model, the deep layer limit coincides with a parameter estimation problem for a nonlinear ordinary differential equation. In particular, whilst it is known that the residual neural network model is a discretisation of an ordinary differential equation, we show convergence in a variational sense. This implies that optimal parameters converge in the deep layer limit. This is a stronger statement than saying for a fixed parameter the residual neural network model converges (the latter does not in general imply the former). Our variational analysis provides a discrete-to-continuum Γ-convergence result for the objective function of the residual neural network training step to a variational problem constrained by a system of ordinary differential equations; this rigorously connects the discrete setting to a continuum problem.
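
The statement that a residual network is a discretisation of an ordinary differential equation can be illustrated with a small sketch. The code below is not taken from the paper: it assumes a scalar state, a tanh activation, a step size of 1/n for n layers, and a hypothetical layer parameter theta(t) = 1 + t. It shows only the weaker, fixed-parameter convergence mentioned in the abstract (the network output stabilises as the number of layers grows), not the variational convergence of optimal parameters that the paper establishes.

```python
import math


def theta(t):
    """Hypothetical layer parameter as a function of depth t in [0, 1]."""
    return 1.0 + t


def residual_forward(x0, theta_fn, n_layers):
    """Forward pass of a toy scalar residual network.

    Layer i applies x <- x + (1/n) * tanh(theta(t_i) * x), which is one
    forward-Euler step of dx/dt = tanh(theta(t) * x) on the interval [0, 1].
    """
    h = 1.0 / n_layers
    x = x0
    for i in range(n_layers):
        t = i * h
        x = x + h * math.tanh(theta_fn(t) * x)
    return x


# Same input and same parameter function, increasing depth.
coarse = residual_forward(0.5, theta, 10)
fine = residual_forward(0.5, theta, 1000)
finer = residual_forward(0.5, theta, 2000)

print(coarse, fine, finer)
```

With the parameter function held fixed, the outputs for 1000 and 2000 layers should agree much more closely than those for 10 and 2000 layers, consistent with forward Euler being a first-order scheme.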


Pages: 44

