Stochastic modified equations for the asynchronous stochastic gradient descent

Cited by: 9
Authors
An, Jing [1]
Lu, Jianfeng [2, 3]
Ying, Lexing [4, 5]
Affiliations
[1] Stanford Univ, Inst Computat & Math Engn, Stanford, CA 94305 USA
[2] Duke Univ, Dept Math, Dept Chem, Box 90320, Durham, NC 27706 USA
[3] Duke Univ, Dept Phys, Box 90320, Durham, NC 27706 USA
[4] Stanford Univ, Dept Math, Stanford, CA 94305 USA
[5] Stanford Univ, Inst Computat & Math Engn ICME, Stanford, CA 94305 USA
Funding
National Science Foundation (USA);
Keywords
stochastic modified equations; asynchronous stochastic gradient descent; optimal control;
DOI
10.1093/imaiai/iaz030
Chinese Library Classification number
O29 [Applied Mathematics];
Discipline classification code
070104;
Abstract
We propose stochastic modified equations (SMEs) for modelling asynchronous stochastic gradient descent (ASGD) algorithms. The resulting SME of Langevin type extracts more information about the ASGD dynamics and elucidates the relationship between different types of stochastic gradient algorithms. We show the convergence of ASGD to the SME in the continuous time limit, as well as the SME's precise prediction of the trajectories of ASGD with various forcing terms. As an application, we propose an optimal mini-batching strategy for ASGD via solving the optimal control problem of the associated SME.
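For readers unfamiliar with ASGD, the following is a minimal illustrative sketch of the discrete dynamics the abstract refers to: gradient updates computed from a stale iterate. It is not taken from the paper and does not implement the paper's SME. The quadratic objective, the fixed delay, and the names `asgd_quadratic`, `lr`, `delay` and `noise` are all assumptions chosen for illustration.

```python
import random


def asgd_quadratic(steps=2000, lr=0.05, delay=3, noise=0.1, x0=1.0, seed=0):
    """Stale-gradient SGD on the toy objective f(x) = x**2 / 2.

    Hypothetical setup (not from the paper): every update uses a gradient
    evaluated at the iterate from `delay` steps earlier, plus Gaussian noise.
    """
    rng = random.Random(seed)
    # history[0] is the iterate from `delay` steps ago, history[-1] the current one.
    history = [x0] * (delay + 1)
    for _ in range(steps):
        stale_x = history[0]
        # Gradient of x**2 / 2 is x; add noise to mimic mini-batch sampling.
        grad = stale_x + noise * rng.gauss(0.0, 1.0)
        history.append(history[-1] - lr * grad)
        history.pop(0)
    return history[-1]


if __name__ == "__main__":
    print("delay=0:", asgd_quadratic(delay=0))
    print("delay=3:", asgd_quadratic(delay=3))
```

With a small step size both runs settle near the minimiser at 0. Setting `delay=0` recovers plain SGD, which makes it easy to compare the two regimes on the same noise seed.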
Pages: 851-873
Number of pages: 23