STOCHASTIC MODIFIED FLOWS FOR RIEMANNIAN STOCHASTIC GRADIENT DESCENT

Citations: 0
Authors
Gess, Benjamin [1 ,2 ]
Kassing, Sebastian [3 ]
Rana, Nimit [4 ]
Affiliations
[1] TU Berlin, Inst Math, D-10623 Berlin, Germany
[2] Max Planck Inst Math Sci, D-04103 Leipzig, Germany
[3] Univ Bielefeld, Fac Math, D-33615 Bielefeld, Germany
[4] Univ York, Dept Math, York YO10 5DD, England
Keywords
Riemannian stochastic gradient descent; diffusion approximation; supervised learning; weak error; Riemannian gradient flow; differential equations; algorithms; semigroups
DOI
10.1137/24M163863X
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
We give quantitative estimates for the rate of convergence of Riemannian stochastic gradient descent (RSGD) to the Riemannian gradient flow and to a diffusion process, the so-called Riemannian stochastic modified flow (RSMF). Using tools from stochastic differential geometry, we show that, in the small learning rate regime, RSGD can be approximated by the solution to the RSMF driven by an infinite-dimensional Wiener process. The RSMF accounts for the random fluctuations of RSGD and thereby increases the order of approximation compared to the deterministic Riemannian gradient flow. RSGD is built using a retraction map, that is, a cost-efficient approximation of the exponential map, and we prove quantitative bounds for the weak error of the diffusion approximation under assumptions on the retraction map, the geometry of the manifold, and the random estimators of the gradient.
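For orientation, the three objects compared in the abstract can be written schematically. Below is a minimal LaTeX sketch: the retraction-based RSGD recursion, the Riemannian gradient flow (the first-order approximation), and, for the diffusion, the flat-space modified SDE familiar from the stochastic modified equations literature, shown only to indicate the structure; the paper's RSMF is its Riemannian counterpart and is driven by an infinite-dimensional Wiener process. The notation (eta for the learning rate, R_x for the retraction, G for an unbiased gradient estimator, Sigma for the covariance of the gradient noise) is illustrative, not necessarily the paper's.

```latex
% Retraction-based RSGD step: R_x maps a tangent vector back to the
% manifold and approximates the exponential map at low cost.
\[
  X_{k+1} \;=\; \mathcal{R}_{X_k}\!\bigl(-\eta\, G(X_k,\xi_k)\bigr),
  \qquad
  \mathbb{E}\,[\,G(x,\xi)\,] \;=\; \operatorname{grad} f(x).
\]
% First-order approximation: the (Riemannian) gradient flow.
\[
  \dot{x}_t \;=\; -\operatorname{grad} f(x_t).
\]
% Second-order approximation, written here in its flat-space form
% (assumption: Euclidean stochastic modified flow; the paper's RSMF is
% the Riemannian analogue): a corrected drift plus O(sqrt(eta)) noise.
\[
  \mathrm{d}Y_t \;=\; -\nabla\!\Bigl(f + \tfrac{\eta}{4}\,\lVert\nabla f\rVert^2\Bigr)(Y_t)\,\mathrm{d}t
  \;+\; \sqrt{\eta}\,\Sigma(Y_t)^{1/2}\,\mathrm{d}W_t.
\]
```

In this scaling, the gradient flow typically captures RSGD to weak order one in the learning rate, while the modified flow, by tracking both the corrected drift and the fluctuations, improves this to weak order two; the paper proves quantitative bounds of this type in the Riemannian setting.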
Pages: 3288 - 3314
Page count: 27
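To make the retraction idea concrete, the following is a minimal, self-contained sketch of retraction-based RSGD. Everything in it is an illustrative assumption, not the paper's setup: a toy Rayleigh-quotient cost on the unit sphere, Gaussian noise on the data matrix, and the metric-projection retraction R_x(v) = (x + v)/||x + v||.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem (illustrative assumption): maximize x^T A x on the unit
# sphere, i.e. minimize f(x) = -x^T A x, whose minimizers are the top
# eigenvectors of A.
n = 5
M = rng.standard_normal((n, n))
A = (M + M.T) / 2  # symmetric data matrix

def retraction(x, v):
    """Metric-projection retraction on the sphere: a cost-efficient
    approximation of the exponential map exp_x(v)."""
    y = x + v
    return y / np.linalg.norm(y)

def riemannian_grad(x, A_hat):
    """Euclidean gradient of f(x) = -x^T A_hat x, projected onto the
    tangent space T_x S^{n-1}."""
    g = -2.0 * A_hat @ x
    return g - (g @ x) * x

eta = 1e-2    # learning rate
sigma = 0.1   # scale of the (assumed) gradient noise
x = rng.standard_normal(n)
x /= np.linalg.norm(x)

for _ in range(5000):
    N = rng.standard_normal((n, n))
    A_hat = A + sigma * (N + N.T) / 2  # unbiased noisy estimate of A
    x = retraction(x, -eta * riemannian_grad(x, A_hat))

# Alignment with the top eigenvector of A: close to 1 up to small
# fluctuations, matching the diffusion picture in the abstract.
w, V = np.linalg.eigh(A)
print(abs(x @ V[:, -1]))
```

The projection step implements the Riemannian gradient (the Euclidean gradient projected onto the tangent space), and the normalization in `retraction` is exactly the kind of cheap surrogate for the exponential map that the abstract refers to.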
Related Papers
50 records in total
  • [1] Stochastic Gradient Descent on Riemannian Manifolds
    Bonnabel, Silvere
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2013, 58 (09) : 2217 - 2229
  • [2] Stochastic Modified Flows, Mean-Field Limits and Dynamics of Stochastic Gradient Descent
    Gess, Benjamin
    Kassing, Sebastian
    Konarovskyi, Vitalii
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25
  • [3] Stochastic modified equations for the asynchronous stochastic gradient descent
    An, Jing
    Lu, Jianfeng
    Ying, Lexing
    INFORMATION AND INFERENCE-A JOURNAL OF THE IMA, 2020, 9 (04) : 851 - 873
  • [4] CONVERGENCE OF RIEMANNIAN STOCHASTIC GRADIENT DESCENT ON HADAMARD MANIFOLD
    Sakai, Hiroyuki
    Iiduka, Hideaki
PACIFIC JOURNAL OF OPTIMIZATION, 2024, 20 (04) : 743 - 767
  • [5] Riemannian proximal stochastic gradient descent for sparse 2DPCA
    Zhang, Zhuan
    Zhou, Shuisheng
    Li, Dong
    Yang, Ting
    DIGITAL SIGNAL PROCESSING, 2022, 122
  • [6] Convergence of Hyperbolic Neural Networks Under Riemannian Stochastic Gradient Descent
    Whiting, Wes
    Wang, Bao
    Xin, Jack
    COMMUNICATIONS ON APPLIED MATHEMATICS AND COMPUTATION, 2024, 6 (02) : 1175 - 1188
  • [7] Adaptive Riemannian stochastic gradient descent and reparameterization for Gaussian mixture model fitting
    Ji, Chunlin
    Fu, Yuhao
    He, Ping
    ASIAN CONFERENCE ON MACHINE LEARNING, VOL 222, 2023, 222
  • [8] Unforgeability in Stochastic Gradient Descent
    Baluta, Teodora
    Nikolic, Ivica
    Jain, Racchit
    Aggarwal, Divesh
    Saxena, Prateek
    PROCEEDINGS OF THE 2023 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, CCS 2023, 2023, : 1138 - 1152
  • [9] Preconditioned Stochastic Gradient Descent
    Li, Xi-Lin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (05) : 1454 - 1466
  • [10] Stochastic gradient descent tricks
    Bottou, Léon
LECTURE NOTES IN COMPUTER SCIENCE, 2012, 7700 : 421 - 436