Stability and Generalization of the Decentralized Stochastic Gradient Descent Ascent Algorithm

Cited by: 0
Authors
Zhu, Miaoxi [1 ,2 ]
Shen, Li [3 ]
Du, Bo [1 ,2 ]
Tao, Dacheng [4 ]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Natl Engn Res Ctr Multimedia Software, Inst Artificial Intelligence, Wuhan, Peoples R China
[2] Wuhan Univ, Hubei Key Lab Multimedia & Network Commun Engn, Wuhan, Peoples R China
[3] JD Explore Acad, Beijing, Peoples R China
[4] Univ Sydney, Sydney, NSW, Australia
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory];
Discipline code
081104; 0812; 0835; 1405;
Abstract
The growing size of available data has attracted increasing interest in solving minimax problems in a decentralized manner for various machine learning tasks. Previous theoretical research has primarily focused on the convergence rate and communication complexity of decentralized minimax algorithms, with little attention given to their generalization. In this paper, we investigate the primal-dual generalization bound of the decentralized stochastic gradient descent ascent (D-SGDA) algorithm using the approach of algorithmic stability under both convex-concave and nonconvex-nonconcave settings. Our theory refines algorithmic stability in the decentralized setting and demonstrates that the decentralized structure does not destroy the stability and generalization of D-SGDA, implying that it can generalize as well as vanilla SGDA in certain situations. Our results characterize the impact of different communication topologies on the generalization bound of D-SGDA, beyond trivial factors such as sample size, learning rates, and the number of iterations. We also evaluate the optimization error and balance it against the generalization gap to obtain the optimal population risk of D-SGDA in the convex-concave setting. Additionally, we perform several numerical experiments that validate our theoretical findings.
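For concreteness, the sketch below shows one common form of the D-SGDA update on which stability analyses of this kind are typically built: each node takes a local stochastic descent step on its primal variable and an ascent step on its dual variable, then averages its iterates with its neighbors through a doubly stochastic mixing matrix W. The simultaneous update order, fixed step sizes, function names, and output rule are illustrative assumptions for this sketch, not necessarily the exact variant analyzed in the paper.

```python
# Minimal sketch of decentralized SGDA (D-SGDA), assuming n nodes that each
# hold local iterates (x_i, y_i) and a doubly stochastic mixing matrix W
# describing the communication graph. Names and signatures are illustrative.
import numpy as np


def d_sgda(grad_x, grad_y, W, x0, y0, eta_x, eta_y, num_iters, rng):
    """Run D-SGDA and return the node-averaged primal/dual iterates.

    grad_x, grad_y: callables (i, x_i, y_i, rng) -> stochastic partial
                    gradients of the local objective f_i at node i.
    W:              (n, n) doubly stochastic mixing matrix.
    x0, y0:         (n, d_x) and (n, d_y) arrays of initial local iterates.
    """
    x, y = x0.copy(), y0.copy()
    n = W.shape[0]
    for _ in range(num_iters):
        gx = np.stack([grad_x(i, x[i], y[i], rng) for i in range(n)])
        gy = np.stack([grad_y(i, x[i], y[i], rng) for i in range(n)])
        # Local simultaneous descent on x / ascent on y, followed by one
        # round of gossip averaging with the neighbors encoded in W.
        x = W @ (x - eta_x * gx)
        y = W @ (y + eta_y * gy)
    # Report the consensus (node-averaged) model, a standard choice when
    # stating generalization bounds for decentralized methods.
    return x.mean(axis=0), y.mean(axis=0)
```

In this sketch the communication topology enters only through W (e.g., a ring versus a fully connected graph), which is the mechanism by which topology can influence stability and generalization bounds of the kind discussed in the abstract.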
Pages: 35
Related Papers
50 records in total
  • [41] Convergence behavior of diffusion stochastic gradient descent algorithm
    Barani, Fatemeh
    Savadi, Abdorreza
    Yazdi, Hadi Sadoghi
    SIGNAL PROCESSING, 2021, 183
  • [42] ON THE CONVERGENCE OF DECENTRALIZED GRADIENT DESCENT
    Yuan, Kun
    Ling, Qing
    Yin, Wotao
    SIAM JOURNAL ON OPTIMIZATION, 2016, 26 (03) : 1835 - 1854
  • [43] On Nonconvex Decentralized Gradient Descent
    Zeng, Jinshan
    Yin, Wotao
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2018, 66 (11) : 2834 - 2848
  • [44] Optimal Epoch Stochastic Gradient Descent Ascent Methods for Min-Max Optimization
    Yan, Yan
    Xu, Yi
    Lin, Qihang
    Liu, Wei
    Yang, Tianbao
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [45] Bandwidth and stability of the stochastic parallel gradient descent algorithm for phase control in coherent beam combination
    Bjorck, Matts
    Henriksson, Markus
    Sjokvist, Lars
    APPLIED OPTICS, 2021, 60 (15) : 4366 - 4374
  • [46] DSA: Decentralized Double Stochastic Averaging Gradient Algorithm
    Mokhtari, Aryan
    Ribeiro, Alejandro
    JOURNAL OF MACHINE LEARNING RESEARCH, 2016, 17
  • [47] Stability of Decentralized Gradient Descent in Open Multi-Agent Systems
    Hendrickx, Julien M.
    Rabbat, Michael G.
    2020 59TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2020, : 4885 - 4890
  • [48] Stability and Generalization for Markov Chain Stochastic Gradient Methods
    Wang, Puyu
    Lei, Yunwen
    Ying, Yiming
    Zhou, Ding-Xuan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
  • [49] Stability and Generalization of Stochastic Gradient Methods for Minimax Problems
    Lei, Yunwen
    Yang, Zhenhuan
    Yang, Tianbao
    Ying, Yiming
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [50] HogWild++: A New Mechanism for Decentralized Asynchronous Stochastic Gradient Descent
    Zhang, Huan
    Hsieh, Cho-Jui
    Akella, Venkatesh
    2016 IEEE 16TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2016, : 629 - 638