Best k-Layer Neural Network Approximations

Cited: 1
Authors
Lim, Lek-Heng [1 ]
Michalek, Mateusz [2 ,3 ]
Qi, Yang [4 ]
Affiliations
[1] Univ Chicago, Dept Stat, Chicago, IL 60637 USA
[2] Max Planck Inst Math Sci, D-04103 Leipzig, Germany
[3] Univ Konstanz, D-78457 Constance, Germany
[4] Ecole Polytech, INRIA Saclay Ile France, CMAP, IP Paris,CNRS, F-91128 Palaiseau, France
Keywords
Neural network; Best approximation; Join loci; Secant loci
DOI
10.1007/s00365-021-09545-2
CLC Classification
O1 [Mathematics]
Subject Classification
0701; 070101
Abstract
We show that the empirical risk minimization (ERM) problem for neural networks has no solution in general. Given a training set s_1, ..., s_n ∈ R^p with corresponding responses t_1, ..., t_n ∈ R^q, fitting a k-layer neural network v_θ : R^p → R^q involves estimation of the weights θ ∈ R^m via an ERM: inf_{θ ∈ R^m} Σ_{i=1}^n ||t_i − v_θ(s_i)||_2^2. We show that even for k = 2, this infimum is not attainable in general for common activations like ReLU, hyperbolic tangent, and sigmoid functions. In addition, we deduce that if one attempts to minimize such a loss function when its infimum is not attainable, the values of θ necessarily diverge to ±∞. We will show that for the smooth activations σ(x) = 1/(1 + exp(−x)) and σ(x) = tanh(x), such failure to attain an infimum can happen on a positive-measure subset of responses. For the ReLU activation σ(x) = max(0, x), we completely classify the cases where the ERM for a best two-layer neural network approximation attains its infimum. In recent applications of neural networks, where overfitting is commonplace, the failure to attain an infimum is avoided by ensuring that the system of equations t_i = v_θ(s_i), i = 1, ..., n, has a solution. For a two-layer ReLU-activated network, we will show when such a system of equations has a solution generically, i.e., when such a neural network can be fitted perfectly with probability one.
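The divergence phenomenon in the abstract can be illustrated numerically: along a path of weights diverging to ±∞, the empirical risk of a two-layer tanh network keeps shrinking toward an infimum that no finite θ attains. The sketch below is a hypothetical illustration, not taken from the paper: the dataset, the diverging path v(s) = r·tanh((1 + 1/r)s) − r·tanh(s), and the target s·sech²(s) (the classical "derivative of the activation" limit) are all our own choices for demonstration.

```python
import numpy as np

# Hypothetical training data: inputs s_i in R, responses t_i = s_i * sech(s_i)^2.
# This target is d/da tanh(a*s) at a = 1, a limit of two-layer tanh networks.
s = np.linspace(-2.0, 2.0, 9)
t = s / np.cosh(s) ** 2

def v_theta(r, s):
    """Two-layer tanh network on a diverging weight path:
    v(s) = r*tanh((1 + 1/r)*s) - r*tanh(s).
    As r -> infinity the outer weights +-r blow up while
    v converges pointwise to s * sech(s)^2 (a difference quotient)."""
    return r * (np.tanh((1.0 + 1.0 / r) * s) - np.tanh(s))

def erm_loss(r):
    """Empirical risk: sum_i || t_i - v_theta(s_i) ||_2^2."""
    return float(np.sum((t - v_theta(r, s)) ** 2))

# The loss keeps decreasing only because the weights diverge;
# the infimum is approached but attained by no finite theta.
losses = [erm_loss(r) for r in (1.0, 10.0, 100.0, 1000.0)]
```

Printing `losses` shows the risk dropping by several orders of magnitude as r grows, mirroring the paper's observation that minimizing a loss whose infimum is unattainable forces θ to diverge.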
Pages: 583-604 (22 pages)
Related Papers (50 total)
  • [41] Intelligent Friction Modeling and Compensation Using Neural Network Approximations
    Huang, Sunan
    Tan, Kok Kiong
    IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2012, 59 (08) : 3342 - 3349
  • [42] Artificial neural network approximations of linear fractional neutron models
    Vyawahare, Vishwesh A.
    Espinosa-Paredes, Gilberto
    Datkhile, Gaurav
    Kadam, Pratik
    ANNALS OF NUCLEAR ENERGY, 2018, 113 : 75 - 88
  • [43] Example Guided Synthesis of Linear Approximations for Neural Network Verification
    Paulsen, Brandon
    Wang, Chao
    COMPUTER AIDED VERIFICATION (CAV 2022), PT I, 2022, 13371 : 149 - 170
  • [44] Mutual Information of Neural Network Initialisations: Mean Field Approximations
    Tanner, Jared
    Ughi, Giuseppe
    2021 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2021: 813 - 818
  • [45] Efficient Parametric Approximations of Neural Network Function Space Distance
    Dhawan, Nikita
    Huang, Sicong
    Bae, Juhan
    Grosse, Roger
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 202, 2023
  • [46] Energy Efficient Neural Computing: A Study of Cross-Layer Approximations
    Sarwar, Syed Shakib
    Srinivasan, Gopalakrishnan
    Han, Bing
    Wijesinghe, Parami
    Jaiswal, Akhilesh
    Panda, Priyadarshini
    Raghunathan, Anand
    Roy, Kaushik
    IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2018, 8 (04) : 796 - 809
  • [47] Multifeedback-layer neural network
    Savran, Aydogan
    IEEE TRANSACTIONS ON NEURAL NETWORKS, 2007, 18 (02): 373 - 384
  • [48] Adaptive two-layer ReLU neural network: I. Best least-squares approximation
    Liu, Min
    Cai, Zhiqiang
    Chen, Jingshuang
    COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2022, 113 : 34 - 44
  • [49] Comparison of single best artificial neural network and neural network ensemble in modeling of palladium microextraction
    Dehghanian, Effat
    Kaykhaii, Massoud
    Mehrpur, Maryam
    MONATSHEFTE FUR CHEMIE, 2015, 146 (08): 1217 - 1227
  • [50] The best approximations of functions and approximations by Steklov's functions
    Lanina, EG
    VESTNIK MOSKOVSKOGO UNIVERSITETA SERIYA 1 MATEMATIKA MEKHANIKA, 2000, (02): 49 - 52