A Game-Theoretic Approach to Multi-agent Trust Region Optimization

Authors
Wen, Ying [1]
Chen, Hui [2]
Yang, Yaodong [3]
Li, Minne [2]
Tian, Zheng [4]
Chen, Xu [5]
Wang, Jun [2]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] UCL, London, England
[3] Peking Univ, Beijing, Peoples R China
[4] ShanghaiTech Univ, Shanghai, Peoples R China
[5] Renmin Univ, Beijing, Peoples R China
Keywords
Multi-agent Reinforcement Learning; Game Theory; Trust Region Optimization
DOI
10.1007/978-3-031-25549-6_6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Trust region methods are widely applied in single-agent reinforcement learning problems due to their monotonic performance-improvement guarantee at every iteration. Nonetheless, when applied in multi-agent settings, the guarantee of trust region methods no longer holds because an agent's payoff is also affected by other agents' adaptive behaviors. To tackle this problem, we conduct a game-theoretical analysis in the policy space, and propose a multi-agent trust region learning method (MATRL), which enables trust region optimization for multi-agent learning. Specifically, MATRL finds a stable improvement direction that is guided by the solution concept of Nash equilibrium at the meta-game level. We derive the monotonic improvement guarantee in multi-agent settings and show the local convergence of MATRL to stable fixed points in differential games. To test our method, we evaluate MATRL in both discrete and continuous multiplayer general-sum games including checker and switch grid worlds, multi-agent MuJoCo, and Atari games. Results suggest that MATRL significantly outperforms strong multi-agent reinforcement learning baselines.
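As a rough illustration of what "Nash equilibrium at the meta-game level" can mean, the sketch below solves a two-player, two-strategy meta-game in which each agent either keeps its current policy (strategy 0) or adopts a trust-region candidate (strategy 1). This is not the MATRL algorithm from the paper. The payoff numbers, the function names `pure_nash` and `mixed_nash_2x2`, and the restriction to a 2x2 game are all assumptions made for illustration.

```python
def pure_nash(A, B):
    """Return all pure-strategy Nash equilibria (i, j) of a 2x2 bimatrix game.

    A[i][j] is the row player's payoff, B[i][j] the column player's payoff.
    """
    equilibria = []
    for i in range(2):
        for j in range(2):
            # Row player cannot gain by switching rows in column j.
            row_best = A[i][j] >= max(A[0][j], A[1][j])
            # Column player cannot gain by switching columns in row i.
            col_best = B[i][j] >= max(B[i][0], B[i][1])
            if row_best and col_best:
                equilibria.append((i, j))
    return equilibria


def mixed_nash_2x2(A, B):
    """Return (p, q) for a fully mixed equilibrium, or None if there is none.

    p is the row player's probability of strategy 0 and q the column
    player's probability of strategy 0, from the indifference conditions.
    """
    denom_p = B[0][0] - B[1][0] - B[0][1] + B[1][1]
    denom_q = A[0][0] - A[0][1] - A[1][0] + A[1][1]
    if denom_p == 0 or denom_q == 0:
        return None
    # p makes the column player indifferent between its two strategies.
    p = (B[1][1] - B[1][0]) / denom_p
    # q makes the row player indifferent between its two strategies.
    q = (A[1][1] - A[0][1]) / denom_q
    if 0 < p < 1 and 0 < q < 1:
        return (p, q)
    return None


# Hypothetical meta-game payoffs (made-up numbers, not from the paper).
# Strategy 0 = keep current policy, strategy 1 = adopt trust-region candidate.
A_meta = [[1.0, 0.5], [1.5, 1.2]]  # agent 1 (row player)
B_meta = [[1.0, 1.4], [0.6, 1.1]]  # agent 2 (column player)

if __name__ == "__main__":
    print("pure equilibria:", pure_nash(A_meta, B_meta))
    print("mixed equilibrium:", mixed_nash_2x2(A_meta, B_meta))
```

With these made-up payoffs the only equilibrium is the pure profile in which both agents adopt their candidates. A game such as matching pennies instead has only a mixed equilibrium, which is the case where equilibrium weights would temper how far each agent moves.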
Pages: 74-87
Page count: 14