Asymptotic Convergence Rate of Dropout on Shallow Linear Neural Networks

Cited by: 1
Authors
Senen-Cerda, Albert [1]
Sanders, Jaron [1]
Affiliations
[1] Eindhoven Univ Technol, Eindhoven, Netherlands
Keywords
Dropout; neural networks; convergence rate; gradient flow
DOI
10.1145/3530898
CLC classification number
TP3 [computing technology, computer technology]
Discipline classification code
0812
Abstract
We analyze the convergence rate of gradient flows on objective functions induced by Dropout and Dropconnect when applied to shallow linear Neural Networks (NNs), which can also be viewed as performing matrix factorization with a particular regularizer. Dropout algorithms such as these are regularization techniques that use {0, 1}-valued random variables to filter weights during training in order to avoid co-adaptation of features. By leveraging a recent result on nonconvex optimization and carefully analyzing the set of minimizers as well as the Hessian of the loss function, we obtain (i) a local convergence proof of the gradient flow and (ii) a bound on the convergence rate that depends on the data, the dropout probability, and the width of the NN. Finally, we compare this theoretical bound to numerical simulations, which agree qualitatively with the convergence bound and match it when starting sufficiently close to a minimizer.
Pages: 53
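
For illustration, dropout applied to the hidden units of a shallow linear network is known to induce an explicitly regularized matrix-factorization objective. The NumPy sketch below is a minimal illustration, not the paper's construction: the dimensions, the retention probability p, and all variable names are assumptions. It estimates the dropout objective by Monte Carlo and compares it with its closed-form expectation, namely the plain squared loss plus a rescaled product-of-norms regularizer.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes for a shallow linear network Y ~ W2 @ W1 @ X (all hypothetical).
d_in, d_hidden, d_out, n = 5, 8, 3, 200
p = 0.7  # dropout retention probability

X = rng.standard_normal((d_in, n))
Y = rng.standard_normal((d_out, n))
W1 = rng.standard_normal((d_hidden, d_in))
W2 = rng.standard_normal((d_out, d_hidden))

def dropout_loss_mc(num_masks=20000):
    # Monte Carlo estimate of E_F || Y - (1/p) W2 diag(F) W1 X ||_F^2 / n,
    # where the entries of F are i.i.d. Bernoulli(p) and filter the hidden units.
    H = W1 @ X
    total = 0.0
    for _ in range(num_masks):
        f = rng.binomial(1, p, size=d_hidden)
        out = (W2 * f) @ H / p        # same as W2 @ np.diag(f) @ H / p
        total += np.sum((Y - out) ** 2) / n
    return total / num_masks

def regularized_loss():
    # Closed-form expectation of the dropout objective: the plain squared loss
    # plus the dropout-induced regularizer
    #   (1 - p)/p * sum_i ||W2[:, i]||^2 * ||(W1 X)[i, :]||^2.
    H = W1 @ X
    fit = np.sum((Y - W2 @ H) ** 2) / n
    reg = (1 - p) / p * np.sum(np.sum(W2 ** 2, axis=0) * np.sum(H ** 2, axis=1)) / n
    return fit + reg

print(dropout_loss_mc())    # approx. regularized_loss(), up to Monte Carlo error
print(regularized_loss())

The two printed values should agree up to Monte Carlo error, which makes concrete the sense in which training this model with Dropout amounts to matrix factorization with a particular regularizer.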