STOCHASTIC MODIFIED FLOWS FOR RIEMANNIAN STOCHASTIC GRADIENT DESCENT

被引:0
|
作者
Gess, Benjamin [1 ,2 ]
Kassing, Sebastian [3 ]
Rana, Nimit [4 ]
机构
[1] TU Berlin, Inst Math, D-10623 Berlin, Germany
[2] Max Planck Inst Math Sci, D-04103 Leipzig, Germany
[3] Univ Bielefeld, Fac Math, D-33615 Bielefeld, Germany
[4] Univ York, Dept Math, York YO10 5DD, England
关键词
Riemannian stochastic gradient descent; diffusion approximation; supervised learning; weak error; Riemannian gradient flow; DIFFUSION-APPROXIMATION; DIFFERENTIAL-EQUATIONS; ALGORITHMS; SEMIGROUPS;
D O I
10.1137/24M163863X
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
We give quantitative estimates for the rate of convergence of Riemannian stochastic gradient descent (RSGD) to Riemannian gradient flow and to a diffusion process, the so-called Riemannian stochastic modified flow (RSMF). Using tools from stochastic differential geometry, we show that, in the small learning rate regime, RSGD can be approximated by the solution to the RSMF driven by an infinite-dimensional Wiener process. The RSMF accounts for the random fluctuations of RSGD and, thereby, increases the order of approximation compared to the deterministic Riemannian gradient flow. The RSGD is built using the concept of a retraction map, that is, a cost-efficient approximation of the exponential map, and we prove quantitative bounds for the weak error of the diffusion approximation under assumptions on the retraction map, the geometry of the manifold, and the random estimators of the gradient.
引用
收藏
页码:3288 / 3314
页数:27
相关论文
共 50 条
  • [21] On the Generalization of Stochastic Gradient Descent with Momentum
    Ramezani-Kebrya, Ali
    Antonakopoulos, Kimon
    Cevher, Volkan
    Khisti, Ashish
    Liang, Ben
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25 : 1 - 56
  • [22] On the different regimes of stochastic gradient descent
    Sclocchi, Antonio
    Wyart, Matthieu
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2023, 121 (09)
  • [23] BACKPROPAGATION AND STOCHASTIC GRADIENT DESCENT METHOD
    AMARI, S
    NEUROCOMPUTING, 1993, 5 (4-5) : 185 - 196
  • [24] Randomized Stochastic Gradient Descent Ascent
    Sebbouh, Othmane
    Cuturi, Marco
    Peyre, Gabriel
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151
  • [25] Graph Drawing by Stochastic Gradient Descent
    Zheng, Jonathan X.
    Pawar, Samraat
    Goodman, Dan F. M.
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2019, 25 (09) : 2738 - 2748
  • [26] On the discrepancy principle for stochastic gradient descent
    Jahn, Tim
    Jin, Bangti
    INVERSE PROBLEMS, 2020, 36 (09)
  • [27] Nonparametric Budgeted Stochastic Gradient Descent
    Trung Le
    Vu Nguyen
    Tu Dinh Nguyen
    Dinh Phung
    ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 51, 2016, 51 : 564 - 572
  • [28] Benign Underfitting of Stochastic Gradient Descent
    Koren, Tomer
    Livni, Roi
    Mansour, Yishay
    Sherman, Uri
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [29] The effective noise of stochastic gradient descent
    Mignacco, Francesca
    Urbani, Pierfrancesco
    JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT, 2022, 2022 (08):
  • [30] On the regularizing property of stochastic gradient descent
    Jin, Bangti
    Lu, Xiliang
    INVERSE PROBLEMS, 2019, 35 (01)