Stochastic modified equations for the asynchronous stochastic gradient descent

Cited: 9
Authors
An, Jing [1 ]
Lu, Jianfeng [2 ,3 ]
Ying, Lexing [4 ,5 ]
Affiliations
[1] Stanford Univ, Inst Computat & Math Engn, Stanford, CA 94305 USA
[2] Duke Univ, Dept Math, Dept Chem, Box 90320, Durham, NC 27706 USA
[3] Duke Univ, Dept Phys, Box 90320, Durham, NC 27706 USA
[4] Stanford Univ, Dept Math, Stanford, CA 94305 USA
[5] Stanford Univ, Inst Computat & Math Engn ICME, Stanford, CA 94305 USA
Funding
U.S. National Science Foundation;
Keywords
stochastic modified equations; asynchronous stochastic gradient descent; optimal control;
DOI
10.1093/imaiai/iaz030
Chinese Library Classification
O29 [Applied Mathematics];
Discipline code
070104;
Abstract
We propose stochastic modified equations (SMEs) for modelling asynchronous stochastic gradient descent (ASGD) algorithms. The resulting SME of Langevin type extracts more information about the ASGD dynamics and elucidates the relationship between different types of stochastic gradient algorithms. We show the convergence of ASGD to the SME in the continuous-time limit, as well as the SME's precise prediction of the trajectories of ASGD with various forcing terms. As an application, we propose an optimal mini-batching strategy for ASGD by solving the optimal control problem of the associated SME.
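To illustrate the SME idea in the abstract, here is a minimal sketch (not from the paper) for plain SGD on a toy quadratic objective f(x) = x²/2 with additive gradient noise: the SGD iteration is compared against an Euler–Maruyama discretization of a Langevin-type SME dX = -∇f(X) dt + √η σ dW. The step size η, noise scale σ, and all variable names are assumptions chosen for the demo; the asynchronous case treated in the paper additionally involves delay-dependent terms, which this sketch omits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: f(x) = x^2 / 2, stochastic gradient g(x) = x + sigma * xi.
# eta is the learning rate; all parameters here are illustrative choices.
eta, sigma, n_steps, n_paths = 0.05, 1.0, 2000, 5000

# SGD: x_{k+1} = x_k - eta * (x_k + sigma * xi_k)
x = np.ones(n_paths)
for _ in range(n_steps):
    x = x - eta * (x + sigma * rng.standard_normal(n_paths))

# Langevin-type SME: dX = -X dt + sqrt(eta) * sigma dW,
# discretized by Euler--Maruyama with time step dt = eta.
y = np.ones(n_paths)
for _ in range(n_steps):
    y = y - eta * y + np.sqrt(eta) * sigma * np.sqrt(eta) * rng.standard_normal(n_paths)

# Both processes equilibrate to stationary variance ~ eta * sigma^2 / 2.
print(x.var(), y.var())
```

With the time step matched to the learning rate, both updates shrink by the same factor and inject noise of the same size per step, so the empirical stationary variances agree with the prediction η σ²/2 = 0.025, which is the sense in which the SME tracks the SGD dynamics.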
Pages: 851 - 873
Page count: 23
Related papers
50 records in total
  • [41] Convergence of Stochastic Gradient Descent for PCA
    Shamir, Ohad
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [42] DAC-SGD: A Distributed Stochastic Gradient Descent Algorithm Based on Asynchronous Connection
    He, Aijia
    Chen, Zehong
    Li, Weichen
    Li, Xingying
    Li, Hongjun
    Zhao, Xin
    IIP'17: PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION PROCESSING, 2017,
  • [43] Improving Training Time of Deep Neural Network With Asynchronous Averaged Stochastic Gradient Descent
    You, Zhao
    Xu, Bo
    2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 446 - 449
  • [44] Stochastic Gradient Descent in Continuous Time
    Sirignano, Justin
    Spiliopoulos, Konstantinos
    SIAM JOURNAL ON FINANCIAL MATHEMATICS, 2017, 8 (01): : 933 - 961
  • [45] On the Hyperparameters in Stochastic Gradient Descent with Momentum
    Shi, Bin
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25
  • [46] Bayesian Distributed Stochastic Gradient Descent
    Teng, Michael
    Wood, Frank
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [47] On the Generalization of Stochastic Gradient Descent with Momentum
    Ramezani-Kebrya, Ali
    Antonakopoulos, Kimon
    Cevher, Volkan
    Khisti, Ashish
    Liang, Ben
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25 : 1 - 56
  • [48] On the different regimes of stochastic gradient descent
    Sclocchi, Antonio
    Wyart, Matthieu
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2023, 121 (09)
  • [49] BACKPROPAGATION AND STOCHASTIC GRADIENT DESCENT METHOD
    AMARI, S
    NEUROCOMPUTING, 1993, 5 (4-5) : 185 - 196
  • [50] Randomized Stochastic Gradient Descent Ascent
    Sebbouh, Othmane
    Cuturi, Marco
    Peyre, Gabriel
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151