Near-Optimal Policy Optimization for Correlated Equilibrium in General-Sum Markov Games

Citations: 0
Authors
Cai, Yang [1 ]
Luo, Haipeng [2 ]
Wei, Chen-Yu [3 ]
Zheng, Weiqiang [1 ]
Affiliations
[1] Yale Univ, New Haven, CT 06520 USA
[2] Univ Southern Calif, Los Angeles, CA 90007 USA
[3] Univ Virginia, Charlottesville, VA USA
Keywords: none listed
DOI: not available
CLC Classification Number: TP18 [Artificial Intelligence Theory]
Subject Classification Codes: 081104; 0812; 0835; 1405
Abstract
We study policy optimization algorithms for computing correlated equilibria in multi-player general-sum Markov games. Previous results achieve an $\tilde{O}(T^{-1/2})$ convergence rate to a correlated equilibrium and an accelerated $\tilde{O}(T^{-3/4})$ convergence rate to the weaker notion of coarse correlated equilibrium. In this paper, we improve both results significantly by providing an uncoupled policy optimization algorithm that attains a near-optimal $\tilde{O}(T^{-1})$ convergence rate for computing a correlated equilibrium. Our algorithm is constructed by combining two main elements: (i) smooth value updates and (ii) the optimistic follow-the-regularized-leader algorithm with the log-barrier regularizer.
Pages: 20
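The abstract names optimistic follow-the-regularized-leader with a log-barrier regularizer as one of the algorithm's two ingredients. The snippet below is a minimal, self-contained sketch of that per-player update over the probability simplex; the step size `eta`, the bisection solver for the barrier update, and the toy loss sequence are our own illustrative assumptions, and the sketch omits the paper's smooth value updates and the Markov-game structure.

```python
# Hedged sketch (not the paper's exact procedure): one optimistic FTRL step
# with a log-barrier regularizer over the probability simplex.
import numpy as np

def log_barrier_ftrl_step(cum_loss, prediction, eta):
    """Solve  argmin_{x in simplex}  <x, L> - (1/eta) * sum_i log(x_i),
    where L = cum_loss + prediction (the 'optimistic' hint).

    KKT conditions give x_i = 1 / (eta * (L_i + lam)); we find the
    multiplier lam by bisection so that the coordinates sum to one.
    """
    L = cum_loss + prediction
    lo = -L.min() + 1e-12          # lam must exceed -min(L) to keep x > 0
    hi = lo + 1.0
    while (1.0 / (eta * (L + hi))).sum() > 1.0:   # expand until total mass < 1
        hi = lo + 2.0 * (hi - lo)
    for _ in range(100):                           # bisection on lam
        mid = 0.5 * (lo + hi)
        if (1.0 / (eta * (L + mid))).sum() > 1.0:
            lo = mid
        else:
            hi = mid
    x = 1.0 / (eta * (L + hi))
    return x / x.sum()                             # tiny renormalization

# Toy usage: the previous loss vector serves as the optimistic hint.
rng = np.random.default_rng(0)
d, eta = 4, 0.5
cum_loss = np.zeros(d)
prev_loss = np.zeros(d)
for t in range(200):
    x = log_barrier_ftrl_step(cum_loss, prediction=prev_loss, eta=eta)
    loss = rng.normal(0.5, 0.1, size=d)
    loss[0] = 0.1                                  # coordinate 0 is best
    cum_loss += loss
    prev_loss = loss
print(np.round(x, 3))  # mass concentrates on the low-loss coordinate
```

In the paper's setting each player would run such an uncoupled no-regret update on its own policy while the value estimates are updated smoothly; the sketch above only illustrates the single-simplex regularized-leader step.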