Strong error analysis for stochastic gradient descent optimization algorithms

Cited by: 13
Authors
Jentzen, Arnulf [1 ]
Kuckuck, Benno [1 ]
Neufeld, Ariel [2 ]
von Wurstemberger, Philippe [3 ]
Affiliations
[1] Univ Munster, Fac Math & Comp Sci, D-48149 Munster, Germany
[2] NTU Singapore, Div Math Sci, Singapore 637371, Singapore
[3] Swiss Fed Inst Technol, Dept Math, CH-8092 Zurich, Switzerland
Funding
Swiss National Science Foundation
Keywords
Stochastic gradient descent; Stochastic approximation algorithms; Strong error analysis; CONVERGENCE RATE; ROBBINS-MONRO; APPROXIMATION; MOMENTS; RATES;
DOI
10.1093/imanum/drz055
Chinese Library Classification (CLC)
O29 [Applied Mathematics]
Subject classification code
070104
Abstract
Stochastic gradient descent (SGD) optimization algorithms are key ingredients in a series of machine learning applications. In this article we perform a rigorous strong error analysis for SGD optimization algorithms. In particular, we prove for every arbitrarily small ε ∈ (0, ∞) and every arbitrarily large p ∈ (0, ∞) that the considered SGD optimization algorithm converges in the strong L^p-sense with order 1/2 - ε to the global minimum of the objective function of the considered stochastic optimization problem, under standard convexity-type assumptions on the objective function and relaxed assumptions on the moments of the stochastic errors appearing in the employed SGD optimization algorithm. The key ideas in our convergence proof are, first, to employ techniques from the theory of Lyapunov-type functions for dynamical systems to develop a general convergence machinery for SGD optimization algorithms based on such functions; then, to apply this general machinery to concrete Lyapunov-type functions with polynomial structures; and, thereafter, to perform an induction argument along the powers appearing in the Lyapunov-type functions in order to achieve strong L^p-convergence rates for every arbitrarily large p ∈ (0, ∞).
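The following Python sketch is purely illustrative and not part of the paper: it runs SGD with step sizes γ_n = c/n on a simple strongly convex quadratic objective with Gaussian gradient noise and estimates the strong L^p error by Monte Carlo over independent runs. All names and constants (theta_star, c, p, the run and step counts) are assumptions chosen for the demonstration, not values from the article.

```python
import numpy as np

# Illustrative sketch (assumed setup, not the paper's construction):
# minimize f(theta) = 0.5 * ||theta - theta_star||^2 via SGD with
# unbiased gradient estimates grad f(theta) + noise and step sizes
# gamma_n = c / n, then estimate the strong L^p error
# E[ ||Theta_N - theta_star||^p ]^(1/p) by Monte Carlo.

rng = np.random.default_rng(0)
theta_star = np.array([1.0, -2.0])   # global minimum (assumed)
c, p = 1.0, 4.0                      # step-size constant, L^p exponent
num_runs, num_steps = 2000, 10_000   # Monte Carlo runs, SGD iterations

theta = np.zeros((num_runs, len(theta_star)))  # all runs start at 0
for n in range(1, num_steps + 1):
    noise = rng.standard_normal(theta.shape)   # stochastic gradient error
    grad_estimate = (theta - theta_star) + noise
    theta -= (c / n) * grad_estimate           # SGD step, gamma_n = c / n

errors = np.linalg.norm(theta - theta_star, axis=1) ** p
lp_error = np.mean(errors) ** (1.0 / p)
print(f"estimated L^{p:g} error after {num_steps} steps: {lp_error:.4f}")
```

Rerunning with a larger num_steps should shrink the estimate roughly like num_steps^(-(1/2 - ε)), matching the rate stated in the abstract.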
Pages: 455-492
Number of pages: 38
Related papers
50 in total; entries [31]-[40] shown below
  • [31] Asymptotic Analysis via Stochastic Differential Equations of Gradient Descent Algorithms in Statistical and Computational Paradigms
    Wang, Yazhen
    Wu, Shang
    JOURNAL OF MACHINE LEARNING RESEARCH, 2020, 21
  • [33] Wireless Network Optimization via Stochastic Sub-gradient Descent: Rate Analysis
    Bedi, Amrit Singh
    Rajawat, Ketan
    2018 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC), 2018
  • [34] ON DISTRIBUTED STOCHASTIC GRADIENT ALGORITHMS FOR GLOBAL OPTIMIZATION
    Swenson, Brian
    Sridhar, Anirudh
    Poor, H. Vincent
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020: 8594-8598
  • [35] A Comparative Analysis of Gradient Descent-Based Optimization Algorithms on Convolutional Neural Networks
    Dogo, E. M.
    Afolabi, O. J.
    Nwulu, N. I.
    Twala, B.
    Aigbavboa, C. O.
    PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON COMPUTATIONAL TECHNIQUES, ELECTRONICS AND MECHANICAL SYSTEMS (CTEMS), 2018: 92-99
  • [36] Stochastic Gradient Descent on a Tree: an Adaptive and Robust Approach to Stochastic Convex Optimization
    Vakili, Sattar
    Salgia, Sudeep
    Zhao, Qing
    2019 57TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2019: 432-438
  • [37] The Minimization of Empirical Risk Through Stochastic Gradient Descent with Momentum Algorithms
    Chaudhuri, Arindam
    ARTIFICIAL INTELLIGENCE METHODS IN INTELLIGENT ALGORITHMS, 2019, 985: 168-181
  • [38] STOCHASTIC GRADIENT DESCENT ALGORITHM FOR STOCHASTIC OPTIMIZATION IN SOLVING ANALYTIC CONTINUATION PROBLEMS
    Bao, Feng
    Maier, Thomas
    FOUNDATIONS OF DATA SCIENCE, 2020, 2 (01): 1-17
  • [39] Robust Pose Graph Optimization Using Stochastic Gradient Descent
    Wang, John
    Olson, Edwin
    2014 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2014: 4284-4289
  • [40] Stochastic Smoothed Gradient Descent Ascent for Federated Minimax Optimization
    Shen, Wei
    Huang, Minhui
    Zhang, Jiawei
    Shen, Cong
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, 2024, 238