Asymptotic Convergence Rate of Dropout on Shallow Linear Neural Networks

Cited by: 1
Authors
Senen-Cerda, Albert [1]
Sanders, Jaron [1]
Affiliations
[1] Eindhoven University of Technology, Eindhoven, Netherlands
Keywords
Dropout; neural networks; convergence rate; gradient flow;
DOI
10.1145/3530898
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
We analyze the convergence rate of gradient flows on objective functions induced by Dropout and Dropconnect when applying them to shallow linear Neural Networks (NNs), which can also be viewed as doing matrix factorization using a particular regularizer. Dropout algorithms such as these are thus regularization techniques that use {0, 1}-valued random variables to filter weights during training in order to avoid co-adaptation of features. By leveraging a recent result on nonconvex optimization and conducting a careful analysis of the set of minimizers as well as the Hessian of the loss function, we are able to obtain (i) a local convergence proof of the gradient flow and (ii) a bound on the convergence rate that depends on the data, the dropout probability, and the width of the NN. Finally, we compare this theoretical bound to numerical simulations, which are in qualitative agreement with the convergence bound and match it when starting sufficiently close to a minimizer.
Pages: 53
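
To make the "matrix factorization using a particular regularizer" viewpoint in the abstract concrete, the following is a standard marginalization sketch rather than the paper's exact objective or scaling: assume a shallow linear network x -> W_2 diag(b) W_1 x with hidden width d, i.i.d. Bernoulli(p) dropout variables b_1, ..., b_d on the hidden layer, and inverted-dropout scaling 1/p. Averaging the squared loss over b gives

\[
\mathbb{E}_{b}\Big\| y - \tfrac{1}{p}\, W_2\,\mathrm{diag}(b)\, W_1 x \Big\|^2
= \big\| y - W_2 W_1 x \big\|^2
+ \frac{1-p}{p} \sum_{i=1}^{d} \big\| (W_2)_{:,i} \big\|^2 \, \big( (W_1)_{i,:}\, x \big)^2 ,
\]

using \(\mathbb{E}[b_i] = p\) and \(\mathbb{E}[b_i b_j] = p^2\) for \(i \neq j\) (and \(= p\) for \(i = j\)). Taking a further expectation over the data (x, y) yields an ordinary squared factorization loss in the product W_2 W_1 plus a data-dependent regularizer whose strength (1 - p)/p is governed by the dropout probability; this is the kind of induced objective whose gradient flow the paper analyzes.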