Near-Optimal Policy Optimization for Correlated Equilibrium in General-Sum Markov Games

Cited: 0
Authors
Cai, Yang [1]
Luo, Haipeng [2]
Wei, Chen-Yu [3]
Zheng, Weiqiang [1]
Affiliations
[1] Yale Univ, New Haven, CT 06520 USA
[2] Univ Southern Calif, Los Angeles, CA 90007 USA
[3] Univ Virginia, Charlottesville, VA USA
Keywords
DOI: Not available
Chinese Library Classification (CLC): TP18 [Artificial Intelligence Theory]
Discipline Classification Codes: 081104; 0812; 0835; 1405
Abstract
We study policy optimization algorithms for computing correlated equilibria in multi-player general-sum Markov games. Previous results achieve an $\tilde{O}(T^{-1/2})$ convergence rate to a correlated equilibrium and an accelerated $\tilde{O}(T^{-3/4})$ convergence rate to the weaker notion of coarse correlated equilibrium. In this paper, we improve both results significantly by providing an uncoupled policy optimization algorithm that attains a near-optimal $\tilde{O}(T^{-1})$ convergence rate for computing a correlated equilibrium. Our algorithm is constructed by combining two main elements: (i) smooth value updates and (ii) the optimistic follow-the-regularized-leader algorithm with the log-barrier regularizer.
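
To make the second building block concrete, below is a minimal sketch (not the paper's exact algorithm) of a single optimistic follow-the-regularized-leader update over the probability simplex with the log-barrier regularizer R(x) = -sum_i log(x_i). The function name log_barrier_ftrl_step, the step size eta, and the toy loss vectors are illustrative assumptions, not taken from the paper.

import numpy as np

def log_barrier_ftrl_step(cum_loss, predicted_loss, eta, tol=1e-12):
    """Minimize <x, cum_loss + predicted_loss> - (1/eta) * sum_i log(x_i) over the simplex.

    Stationarity gives x_i = 1 / (eta * (g_i + lam)) for a scalar lam chosen so
    that the coordinates sum to one; lam is found by bisection.
    """
    g = np.asarray(cum_loss, dtype=float) + np.asarray(predicted_loss, dtype=float)
    lo = -g.min() + 1e-12                        # lam must exceed -min(g) so every coordinate stays positive
    hi = lo + 1.0
    while (1.0 / (eta * (g + hi))).sum() > 1.0:  # grow hi until the coordinate sum drops below 1
        hi += hi - lo
    for _ in range(200):                         # bisection: sum_i x_i(lam) is decreasing in lam
        mid = 0.5 * (lo + hi)
        if (1.0 / (eta * (g + mid))).sum() > 1.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    x = 1.0 / (eta * (g + 0.5 * (lo + hi)))
    return x / x.sum()                           # renormalize away residual bisection error

# Toy usage: one agent with 3 actions; the optimistic prediction is the last observed loss.
rng = np.random.default_rng(0)
cum_loss, last_loss = np.zeros(3), np.zeros(3)
for t in range(100):
    x = log_barrier_ftrl_step(cum_loss, last_loss, eta=0.1)
    loss = rng.uniform(size=3)                   # stand-in for the per-round loss induced by the game
    cum_loss += loss
    last_loss = loss

The log-barrier regularizer keeps every action probability strictly positive, which is what yields the closed-form stationarity condition x_i = 1/(eta * (g_i + lam)) and reduces the update to a one-dimensional bisection over lam.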
Pages: 20
Related Papers
50 in total
  • [31] Safely Using Predictions in General-Sum Normal Form Games
    Damer, Steven
    Gini, Maria
    AAMAS'17: PROCEEDINGS OF THE 16TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2017, : 924 - 932
  • [32] Nash Q-learning for general-sum stochastic games
    Hu, JL
    Wellman, MP
    JOURNAL OF MACHINE LEARNING RESEARCH, 2004, 4 (06) : 1039 - 1069
  • [33] A Bayesian reinforcement learning approach in Markov games for computing near-optimal policies
    Julio B. Clempner
    Annals of Mathematics and Artificial Intelligence, 2023, 91 : 675 - 690
  • [34] A Bayesian reinforcement learning approach in Markov games for computing near-optimal policies
    Clempner, Julio B.
    ANNALS OF MATHEMATICS AND ARTIFICIAL INTELLIGENCE, 2023, 91 (05) : 675 - 690
  • [35] Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity
    Zhang, Kaiqing
    Kakade, Sham M.
    Basar, Tamer
    Yang, Lin F.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [36] Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity
    Zhang, Kaiqing
    Kakade, Sham M.
    Basar, Tamer
    Yang, Lin F.
    JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [37] Near-Optimal No-Regret Learning Dynamics for General Convex Games
    Farina, Gabriele
    Anagnostides, Ioannis
    Luo, Haipeng
    Lee, Chung-Wei
    Kroer, Christian
    Sandholm, Tuomas
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [38] Double-Oracle Sampling Method for Stackelberg Equilibrium Approximation in General-Sum Extensive-Form Games
    Karwowski, Jan
    Mandziuk, Jacek
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 2054 - 2061
  • [39] Stackelberg Equilibrium Approximation in General-Sum Extensive-Form Games with Double-Oracle Sampling Method
    Karwowski, Jan
    Mandziuk, Jacek
    AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 2045 - 2047
  • [40] Identifying and Responding to Cooperative Actions in General-sum Normal Form Games
    Damer, Steven
    AAMAS'17: PROCEEDINGS OF THE 16TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2017, : 1826 - 1827