On the Saturation Phenomenon of Stochastic Gradient Descent for Linear Inverse Problems*

Cited by: 4
Authors
Jin, Bangti [1 ]
Zhou, Zehui [2 ]
Zou, Jun [2 ]
Affiliations
[1] UCL, Dept Comp Sci, London WC1E 6BT, England
[2] Chinese Univ Hong Kong, Dept Math, Shatin, Hong Kong, Peoples R China
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
stochastic gradient descent; regularizing property; convergence rate; saturation; inverse problems; approximation; convergence
DOI
10.1137/20M1374456
Chinese Library Classification
O1 [Mathematics]
Discipline codes
0701; 070101
Abstract
Stochastic gradient descent (SGD) is a promising method for solving large-scale inverse problems due to its excellent scalability with respect to data size. The current mathematical theory, viewed through the lens of regularization theory, predicts that SGD with a polynomially decaying stepsize schedule may suffer from an undesirable saturation phenomenon; i.e., the convergence rate does not improve further with the solution regularity index once that index exceeds a certain range. In this work, we present a refined convergence rate analysis of SGD and prove that saturation actually does not occur if the initial stepsize of the schedule is sufficiently small. Several numerical experiments are provided to complement the analysis.
Pages: 1553-1588 (36 pages)
Related papers
50 records
  • [41] Stochastic gradient descent implementation of the modified forward-backward linear prediction
    Riasati, Vahid R.
    Schuetterle, Patrick G.
    O'Hara, Christopher
    PATTERN RECOGNITION AND TRACKING XXIX, 2018, 10649
  • [42] Stability analysis of stochastic gradient descent for homogeneous neural networks and linear classifiers
    Paquin, Alexandre Lemire
    Chaib-draa, Brahim
    Giguere, Philippe
    NEURAL NETWORKS, 2023, 164 : 382 - 394
  • [43] Large scale semi-supervised linear SVM with stochastic gradient descent
    Zhou, X.
    Binary Information Press (09)
  • [44] On the Almost Sure Convergence of Stochastic Gradient Descent in Non-Convex Problems
    Mertikopoulos, Panayotis
    Hallak, Nadav
    Kavis, Ali
    Cevher, Volkan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [45] Convergence of Stochastic Gradient Descent for PCA
    Shamir, Ohad
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [46] Stochastic Gradient Descent in Continuous Time
    Sirignano, Justin
    Spiliopoulos, Konstantinos
    SIAM JOURNAL ON FINANCIAL MATHEMATICS, 2017, 8 (01): : 933 - 961
  • [47] On the Hyperparameters in Stochastic Gradient Descent with Momentum
    Shi, Bin
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25
  • [48] Bayesian Distributed Stochastic Gradient Descent
    Teng, Michael
    Wood, Frank
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [49] On the Generalization of Stochastic Gradient Descent with Momentum
    Ramezani-Kebrya, Ali
    Antonakopoulos, Kimon
    Cevher, Volkan
    Khisti, Ashish
    Liang, Ben
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25 : 1 - 56
  • [50] On the different regimes of stochastic gradient descent
    Sclocchi, Antonio
    Wyart, Matthieu
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2024, 121 (09)