STOCHASTIC MODIFIED FLOWS FOR RIEMANNIAN STOCHASTIC GRADIENT DESCENT

Cited by: 0
Authors
Gess, Benjamin [1 ,2 ]
Kassing, Sebastian [3 ]
Rana, Nimit [4 ]
Affiliations
[1] TU Berlin, Inst Math, D-10623 Berlin, Germany
[2] Max Planck Inst Math Sci, D-04103 Leipzig, Germany
[3] Univ Bielefeld, Fac Math, D-33615 Bielefeld, Germany
[4] Univ York, Dept Math, York YO10 5DD, England
Keywords
Riemannian stochastic gradient descent; diffusion approximation; supervised learning; weak error; Riemannian gradient flow; differential equations; algorithms; semigroups
DOI
10.1137/24M163863X
CLC classification
TP [Automation technology; Computer technology]
Subject classification code
0812
Abstract
We give quantitative estimates for the rate of convergence of Riemannian stochastic gradient descent (RSGD) to Riemannian gradient flow and to a diffusion process, the so-called Riemannian stochastic modified flow (RSMF). Using tools from stochastic differential geometry, we show that, in the small learning rate regime, RSGD can be approximated by the solution to the RSMF driven by an infinite-dimensional Wiener process. The RSMF accounts for the random fluctuations of RSGD and, thereby, increases the order of approximation compared to the deterministic Riemannian gradient flow. The RSGD is built using the concept of a retraction map, that is, a cost-efficient approximation of the exponential map, and we prove quantitative bounds for the weak error of the diffusion approximation under assumptions on the retraction map, the geometry of the manifold, and the random estimators of the gradient.
Pages: 3288-3314
Number of pages: 27
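For orientation, here is a schematic of the objects described in the abstract, written in generic shorthand of our own; it is a sketch under standard assumptions, not the paper's exact formulation. Given a loss f on a Riemannian manifold, a retraction R, a learning rate eta, and a stochastic gradient estimator driven by i.i.d. samples xi_k, the RSGD iteration and its two continuum approximations can be summarized as follows.

% Schematic only (our notation; the precise drift and diffusion coefficients and the
% infinite-dimensional driving Wiener process are specified in the paper):
X_{k+1} = \mathcal{R}_{X_k}\!\bigl(-\eta\, \widetilde{\nabla} f(X_k,\xi_k)\bigr)
    % RSGD step: a retraction applied to a stochastic gradient direction
\dot{x}(t) = -\operatorname{grad} f\bigl(x(t)\bigr)
    % first-order, deterministic limit: the Riemannian gradient flow
\mathrm{d}Y_t = -\operatorname{grad} f(Y_t)\,\mathrm{d}t + \sqrt{\eta}\,\sigma(Y_t)\,\mathrm{d}W_t
    % schematic stochastic modified flow: noise of size sqrt(eta) models the gradient fluctuations

The weak-error bounds described in the abstract compare the law of the RSGD iterates with these two flows in the small-learning-rate regime, with the noise term in the modified flow providing the higher-order approximation.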