Near-Optimal Policy Optimization for Correlated Equilibrium in General-Sum Markov Games

Cited: 0

Authors
Cai, Yang [1 ]
Luo, Haipeng [2 ]
Wei, Chen-Yu [3 ]
Zheng, Weiqiang [1 ]
Affiliations
[1] Yale Univ, New Haven, CT 06520 USA
[2] Univ Southern Calif, Los Angeles, CA 90007 USA
[3] Univ Virginia, Charlottesville, VA USA
Keywords: (none listed)
DOI: none available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
We study policy optimization algorithms for computing correlated equilibria in multiplayer general-sum Markov games. Previous results achieve an Õ(T^{-1/2}) convergence rate to a correlated equilibrium and an accelerated Õ(T^{-3/4}) convergence rate to the weaker notion of coarse correlated equilibrium. In this paper, we improve both results significantly by providing an uncoupled policy optimization algorithm that attains a near-optimal Õ(T^{-1}) convergence rate for computing a correlated equilibrium. Our algorithm is constructed by combining two main elements: (i) smooth value updates and (ii) the optimistic follow-the-regularized-leader (OFTRL) algorithm with the log-barrier regularizer.
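The abstract's second ingredient, optimistic follow-the-regularized-leader with a log-barrier regularizer, can be sketched for a single player over a finite action set. The snippet below is a minimal illustration under stated assumptions: the function name is hypothetical, the bisection solver is one convenient way to compute the update, and the paper's full algorithm additionally couples this step with smooth value updates across the game's states.

```python
# A minimal sketch of one OFTRL step with the log-barrier regularizer on the
# probability simplex. Illustrative only; not the paper's implementation.

def oftrl_logbarrier_step(cum_loss, prediction, eta):
    """Compute argmin_x <cum_loss + prediction, x> - (1/eta) * sum_i log(x_i)
    over the probability simplex, via bisection on the KKT multiplier."""
    L = [c + p for c, p in zip(cum_loss, prediction)]  # optimistic loss vector
    n = len(L)
    # Stationarity gives x_i = 1 / (eta * (L_i + lam)) with lam > -min(L);
    # choose lam so the coordinates sum to 1 (the sum is decreasing in lam).
    lo = -min(L) + 1e-12
    hi = -min(L) + n / eta + 1.0   # at this lam the total mass is already <= 1
    for _ in range(100):
        lam = 0.5 * (lo + hi)
        s = sum(1.0 / (eta * (Li + lam)) for Li in L)
        if s > 1.0:
            lo = lam               # total mass too large: increase lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    x = [1.0 / (eta * (Li + lam)) for Li in L]
    z = sum(x)
    return [xi / z for xi in x]    # renormalize away residual bisection error
```

With zero losses the update returns the uniform distribution; with a positive loss on one action, that action's probability shrinks while staying strictly positive, which is the defining property of the log-barrier (it keeps iterates in the interior of the simplex).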
Pages: 20
Related Papers (50 total)
  • [11] Learning in Markov Games: can we exploit a general-sum opponent?
    Ramponi, Giorgia
    Restelli, Marcello
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, VOL 180, 2022, 180 : 1665 - 1675
  • [12] A Near-optimal High-probability Swap-regret Upper Bound for Multi-agent Bandits in Unknown General-sum Games
    Huang, Zhiming
    Pan, Jianping
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216 : 911 - 921
  • [13] Policy Invariance under Reward Transformations for General-Sum Stochastic Games
    Lu, Xiaosong
    Schwartz, Howard M.
    Givigi, Sidney N., Jr.
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2011, 41 : 397 - 406
  • [14] Convergence of Decentralized Actor-Critic Algorithm in General-Sum Markov Games
    Maheshwari, Chinmay
    Wu, Manxi
    Sastry, Shankar
    IEEE CONTROL SYSTEMS LETTERS, 2024, 8 : 2643 - 2648
  • [15] Convergent gradient ascent in general-sum games
    Banerjee, B
    Peng, J
    MACHINE LEARNING: ECML 2002, 2002, 2430 : 1 - 9
  • [16] On the complexity of computing Markov perfect equilibrium in general-sum stochastic games (vol 10, nwac256, 2023)
    Deng, Xiaotie
    Li, Ningyuan
    Mguni, David
    Wang, Jun
    Yang, Yaodong
    NATIONAL SCIENCE REVIEW, 2023, 10 (02)
  • [17] Robustness and Sample Complexity of Model-Based MARL for General-Sum Markov Games
    Subramanian, Jayakumar
    Sinha, Amit
    Mahajan, Aditya
    DYNAMIC GAMES AND APPLICATIONS, 2023, 13 (01) : 56 - 88
  • [19] Convergence of Policy Gradient Methods for Nash Equilibria in General-sum Stochastic Games
    Chen, Yan
    Li, Tao
    IFAC PAPERSONLINE, 2023, 56 (02): : 3435 - 3440
  • [20] A Robust Optimization Approach to Designing Near-Optimal Strategies for Constant-Sum Monitoring Games
    Rahmattalabi, Aida
    Vayanos, Phebe
    Tambe, Milind
    DECISION AND GAME THEORY FOR SECURITY, GAMESEC 2018, 2018, 11199 : 603 - 622