STOCHASTIC MODIFIED FLOWS FOR RIEMANNIAN STOCHASTIC GRADIENT DESCENT

Times Cited: 0
Authors
Gess, Benjamin [1 ,2 ]
Kassing, Sebastian [3 ]
Rana, Nimit [4 ]
Affiliations
[1] TU Berlin, Inst Math, D-10623 Berlin, Germany
[2] Max Planck Inst Math Sci, D-04103 Leipzig, Germany
[3] Univ Bielefeld, Fac Math, D-33615 Bielefeld, Germany
[4] Univ York, Dept Math, York YO10 5DD, England
Keywords
Riemannian stochastic gradient descent; diffusion approximation; supervised learning; weak error; Riemannian gradient flow; differential equations; algorithms; semigroups
DOI
10.1137/24M163863X
CLC Number: TP [Automation technology; computer technology]
Subject Classification Code: 0812
Abstract
We give quantitative estimates for the rate of convergence of Riemannian stochastic gradient descent (RSGD) to the Riemannian gradient flow and to a diffusion process, the so-called Riemannian stochastic modified flow (RSMF). Using tools from stochastic differential geometry, we show that, in the small learning rate regime, RSGD can be approximated by the solution to the RSMF driven by an infinite-dimensional Wiener process. The RSMF accounts for the random fluctuations of RSGD and thereby increases the order of approximation compared with the deterministic Riemannian gradient flow. RSGD is built using the concept of a retraction map, that is, a cost-efficient approximation of the exponential map, and we prove quantitative bounds for the weak error of the diffusion approximation under assumptions on the retraction map, the geometry of the manifold, and the random estimators of the gradient.
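To make the retraction-based update concrete, the following is a minimal illustrative sketch of RSGD on the unit sphere; the Rayleigh-quotient loss, the Gaussian noise model, and the metric-projection retraction are hypothetical choices for exposition and are not the estimators or assumptions analyzed in the paper. In the small learning rate regime, trajectories of such an iteration track the Riemannian gradient flow, with the RSMF providing a higher-order diffusion correction for the stochastic fluctuations.

```python
# Illustrative Riemannian SGD on the unit sphere (hypothetical example).
import numpy as np

def retraction(x, v):
    """Metric-projection retraction on the sphere: R_x(v) = (x + v) / ||x + v||,
    a cost-efficient first-order approximation of the exponential map."""
    y = x + v
    return y / np.linalg.norm(y)

def riemannian_grad(x, g):
    """Project the Euclidean gradient g onto the tangent space at x."""
    return g - np.dot(g, x) * x

def rsgd(A, x0, lr=1e-2, noise=0.1, steps=10_000, seed=None):
    """RSGD for f(x) = 0.5 * x^T A x on the sphere, with additive Gaussian
    noise as a stand-in for a stochastic gradient estimator."""
    rng = np.random.default_rng(seed)
    x = x0 / np.linalg.norm(x0)
    for _ in range(steps):
        g = A @ x + noise * rng.standard_normal(x.shape)  # noisy Euclidean gradient
        v = -lr * riemannian_grad(x, g)                    # step in the tangent space
        x = retraction(x, v)                               # map back onto the manifold
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = rng.standard_normal((5, 5))
    A = M + M.T                                            # symmetric test matrix
    x = rsgd(A, rng.standard_normal(5), seed=1)
    print("approx. smallest eigenvalue:", x @ A @ x)       # compare with np.linalg.eigvalsh(A)[0]
```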
Pages: 3288-3314
Number of pages: 27