Asymptotic Convergence Rate of Dropout on Shallow Linear Neural Networks

Cited by: 1
Authors
Senen-Cerda, Albert [1]
Sanders, Jaron [1]
Affiliations
[1] Eindhoven Univ Technol, Eindhoven, Netherlands
Keywords
Dropout; neural networks; convergence rate; gradient flow
DOI
10.1145/3530898
CLC classification number
TP3 [computing technology, computer technology]
Discipline classification code
0812
Abstract
We analyze the convergence rate of gradient flows on objective functions induced by Dropout and Dropconnect when applied to shallow linear Neural Networks (NNs), which can also be viewed as performing matrix factorization with a particular regularizer. Dropout algorithms such as these are regularization techniques that use {0, 1}-valued random variables to filter weights during training in order to avoid co-adaptation of features. By leveraging a recent result on nonconvex optimization and carefully analyzing the set of minimizers as well as the Hessian of the loss function, we obtain (i) a local convergence proof of the gradient flow and (ii) a bound on the convergence rate that depends on the data, the dropout probability, and the width of the NN. Finally, we compare this theoretical bound to numerical simulations, which agree qualitatively with the convergence bound and match it when starting sufficiently close to a minimizer.
Pages: 53
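
For illustration, dropout applied to the hidden units of a shallow linear network is known to induce an explicitly regularized matrix-factorization objective. The NumPy sketch below is a minimal illustration, not the paper's construction: the dimensions, the retention probability p, and all variable names are assumptions. It estimates the dropout objective by Monte Carlo and compares it with its closed-form expectation, namely the plain squared loss plus a rescaled product-of-norms regularizer.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes for a shallow linear network Y ~ W2 @ W1 @ X (all hypothetical).
d_in, d_hidden, d_out, n = 5, 8, 3, 200
p = 0.7  # dropout retention probability

X = rng.standard_normal((d_in, n))
Y = rng.standard_normal((d_out, n))
W1 = rng.standard_normal((d_hidden, d_in))
W2 = rng.standard_normal((d_out, d_hidden))

def dropout_loss_mc(num_masks=20000):
    # Monte Carlo estimate of E_F || Y - (1/p) W2 diag(F) W1 X ||_F^2 / n,
    # where the entries of F are i.i.d. Bernoulli(p) and filter the hidden units.
    H = W1 @ X
    total = 0.0
    for _ in range(num_masks):
        f = rng.binomial(1, p, size=d_hidden)
        out = (W2 * f) @ H / p        # same as W2 @ np.diag(f) @ H / p
        total += np.sum((Y - out) ** 2) / n
    return total / num_masks

def regularized_loss():
    # Closed-form expectation of the dropout objective: the plain squared loss
    # plus the dropout-induced regularizer
    #   (1 - p)/p * sum_i ||W2[:, i]||^2 * ||(W1 X)[i, :]||^2.
    H = W1 @ X
    fit = np.sum((Y - W2 @ H) ** 2) / n
    reg = (1 - p) / p * np.sum(np.sum(W2 ** 2, axis=0) * np.sum(H ** 2, axis=1)) / n
    return fit + reg

print(dropout_loss_mc())    # approx. regularized_loss(), up to Monte Carlo error
print(regularized_loss())

The two printed values should agree up to Monte Carlo error, which makes concrete the sense in which training this model with Dropout amounts to matrix factorization with a particular regularizer.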