Stochastic modified equations for the asynchronous stochastic gradient descent

Cited by: 9
Authors
An, Jing [1 ]
Lu, Jianfeng [2 ,3 ]
Ying, Lexing [4 ,5 ]
Affiliations
[1] Stanford Univ, Inst Computat & Math Engn, Stanford, CA 94305 USA
[2] Duke Univ, Dept Math, Dept Chem, Box 90320, Durham, NC 27706 USA
[3] Duke Univ, Dept Phys, Box 90320, Durham, NC 27706 USA
[4] Stanford Univ, Dept Math, Stanford, CA 94305 USA
[5] Stanford Univ, Inst Computat & Math Engn ICME, Stanford, CA 94305 USA
Funding
US National Science Foundation;
Keywords
stochastic modified equations; asynchronous stochastic gradient descent; optimal control;
DOI
10.1093/imaiai/iaz030
CLC number
O29 [Applied Mathematics];
Discipline code
070104;
Abstract
We propose stochastic modified equations (SMEs) for modelling asynchronous stochastic gradient descent (ASGD) algorithms. The resulting SME of Langevin type extracts more information about the ASGD dynamics and elucidates the relationship between different types of stochastic gradient algorithms. We show the convergence of ASGD to the SME in the continuous-time limit, as well as the SME's precise prediction of the ASGD trajectories under various forcing terms. As an application, we propose an optimal mini-batching strategy for ASGD by solving the optimal control problem of the associated SME.
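For context, a minimal sketch of what an SME looks like for plain SGD (the standard first-order form from the SME literature, not the paper's ASGD-specific Langevin equation, which additionally accounts for gradient staleness):

```latex
% SGD iteration: X_{k+1} = X_k - \eta \nabla f_{\gamma_k}(X_k),
% with learning rate \eta and a randomly sampled mini-batch gradient.
% Its first-order stochastic modified equation is the SDE
\[
  \mathrm{d}X_t \;=\; -\nabla f(X_t)\,\mathrm{d}t
  \;+\; \sqrt{\eta}\,\Sigma(X_t)^{1/2}\,\mathrm{d}W_t ,
\]
% where f is the full objective, \Sigma(x) is the covariance of the
% stochastic gradients at x, and W_t is standard Brownian motion.
```

The SME approximates the discrete iterates in distribution over time scales of order one; the drift captures the mean descent direction while the diffusion term encodes the mini-batch noise, which is what allows such equations to distinguish between variants of stochastic gradient algorithms.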
Pages: 851-873 (23 pages)