Learning to Coordinate in Multi-Agent Systems: A Coordinated Actor-Critic Algorithm and Finite-Time Guarantees

Cited: 0
Authors
Zeng, Siliang [1 ]
Chen, Tianyi [2 ]
Garcia, Alfredo [3 ]
Hong, Mingyi [1 ]
Affiliations
[1] Univ Minnesota, Dept Elect & Comp Engn, Minneapolis, MN 55455 USA
[2] Rensselaer Polytech Inst, Dept Elect Comp & Syst Engn, Troy, NY 12181 USA
[3] Texas A&M Univ, Dept Ind & Syst Engn, College Stn, TX 77843 USA
Keywords
Multi-Agent Reinforcement Learning; Actor-Critic; Parameter Sharing; Optimization
DOI
N/A
CLC Number
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Multi-agent reinforcement learning (MARL) has attracted much research attention recently. However, unlike its single-agent counterpart, many theoretical and algorithmic aspects of MARL are not yet well understood. In this paper, we study the emergence of coordinated behavior among autonomous agents using an actor-critic (AC) algorithm. Specifically, we propose and analyze a class of coordinated actor-critic (CAC) algorithms in which individually parameterized policies have a shared part (which is jointly optimized among all agents) and a personalized part (which is optimized only locally). Such partially personalized policies allow agents to coordinate by leveraging peers' experience while adapting to their individual tasks. The flexibility of our design allows the proposed CAC algorithm to be used in a fully decentralized setting, where agents can only communicate with their neighbors, as well as in a federated setting, where agents occasionally communicate with a server while optimizing their (partially personalized) local models. Theoretically, we show that under standard regularity assumptions, the proposed CAC algorithm requires O(epsilon^{-5/2}) samples to achieve an epsilon-stationary solution (a solution at which the squared norm of the gradient of the objective function is less than epsilon). To the best of our knowledge, this work provides the first finite-sample guarantee for decentralized AC algorithms with partially personalized policies.
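The shared/personalized policy split described in the abstract can be illustrated with a minimal sketch. This is a toy construction, not the paper's actual algorithm: the class name `AgentPolicy`, the flat parameter vectors, and the specific mixing matrix are all assumptions made for illustration. The key structural idea it shows is that each agent updates both parameter blocks locally, but only the shared block is averaged with neighbors through a doubly stochastic mixing matrix (the decentralized setting).

```python
import numpy as np

class AgentPolicy:
    """Policy parameters split into a shared part (mixed with neighbors)
    and a personalized part (updated only locally)."""
    def __init__(self, shared_dim, personal_dim, rng):
        self.shared = rng.standard_normal(shared_dim)
        self.personal = rng.standard_normal(personal_dim)

def local_step(policy, grad_shared, grad_personal, lr=0.1):
    # Actor update: each agent descends its local policy gradient
    # on both the shared and the personalized parameters.
    policy.shared -= lr * grad_shared
    policy.personal -= lr * grad_personal

def mix_shared(policies, W):
    # Decentralized coordination: only the shared parts are averaged
    # with neighbors via a doubly stochastic mixing matrix W;
    # personalized parts are never exchanged.
    stacked = np.stack([p.shared for p in policies])
    mixed = W @ stacked
    for p, row in zip(policies, mixed):
        p.shared = row

# Toy run: three agents; repeated mixing drives the shared parts to
# consensus while the personalized parts remain distinct per agent.
rng = np.random.default_rng(0)
agents = [AgentPolicy(shared_dim=4, personal_dim=2, rng=rng) for _ in range(3)]
W = np.full((3, 3), 0.25) + 0.25 * np.eye(3)  # doubly stochastic, rows sum to 1
for _ in range(50):
    mix_shared(agents, W)
spread = max(np.linalg.norm(a.shared - agents[0].shared) for a in agents)
print(spread)  # near zero: shared parts have reached consensus
```

In the federated setting mentioned in the abstract, the same shared block would instead be averaged occasionally through a server rather than via a neighbor mixing matrix.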
Pages: 13
Related Papers (50 records)
  • [1] On Finite-Time Convergence of Actor-Critic Algorithm
    Qiu, S.; Yang, Z.; Ye, J.; Wang, Z.
    IEEE Journal on Selected Areas in Information Theory, 2021, 2(2): 652-664
  • [2] Asynchronous Actor-Critic for Multi-Agent Reinforcement Learning
    Xiao, Yuchen; Tan, Weihao; Amato, Christopher
    Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022
  • [3] A New Advantage Actor-Critic Algorithm For Multi-Agent Environments
    Paczolay, Gabor; Harmati, Istvan
    2020 23rd IEEE International Symposium on Measurement and Control in Robotics (ISMCR), 2020
  • [4] Actor-Critic Algorithms for Constrained Multi-agent Reinforcement Learning
    Diddigi, Raghuram Bharadwaj; Reddy, D. Sai Koti; Prabuchandran, K. J.; Bhatnagar, Shalabh
    AAMAS '19: Proceedings of the 18th International Conference on Autonomous Agents and Multiagent Systems, 2019: 1931-1933
  • [5] Multi-Agent Natural Actor-Critic Reinforcement Learning Algorithms
    Trivedi, Prashant; Hemachandra, Nandyala
    Dynamic Games and Applications, 2023, 13(1): 25-55
  • [6] Shared Experience Actor-Critic for Multi-Agent Reinforcement Learning
    Christianos, Filippos; Schafer, Lukas; Albrecht, Stefano V.
    Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020, 33
  • [7] A multi-agent reinforcement learning using Actor-Critic methods
    Li, Chun-Gui; Wang, Meng; Yuan, Qing-Neng
    Proceedings of 2008 International Conference on Machine Learning and Cybernetics, Vols 1-7, 2008: 878-882
  • [8] Distributed Multi-Agent Reinforcement Learning by Actor-Critic Method
    Heredia, Paulo C.; Mou, Shaoshuai
    IFAC PapersOnLine, 2019, 52(20): 363-368
  • [9] Multi-agent actor-critic with time dynamical opponent model
    Tian, Yuan; Kladny, Klaus-Rudolf; Wang, Qin; Huang, Zhiwu; Fink, Olga
    Neurocomputing, 2023, 517: 165-172