Learning to Coordinate in Multi-Agent Systems: A Coordinated Actor-Critic Algorithm and Finite-Time Guarantees

Cited by: 0

Authors:

Zeng, Siliang [1]

Chen, Tianyi [2]

Garcia, Alfredo [3]

Hong, Mingyi [1]

Affiliations:

[1] Univ Minnesota, Dept Elect & Comp Engn, Minneapolis, MN 55455 USA

[2] Rensselaer Polytech Inst, Dept Elect Comp & Syst Engn, Troy, NY 12181 USA

[3] Texas A&M Univ, Dept Ind & Syst Engn, College Stn, TX 77843 USA

Source:

LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 168 | 2022 / Vol. 168

Keywords:

Multi-Agent Reinforcement Learning; Actor-Critic; Parameter Sharing; OPTIMIZATION;

DOI:

Not available

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Multi-agent reinforcement learning (MARL) has attracted much research attention recently. However, unlike its single-agent counterpart, many theoretical and algorithmic aspects of MARL are not yet well understood. In this paper, we study the emergence of coordinated behavior by autonomous agents using an actor-critic (AC) algorithm. Specifically, we propose and analyze a class of coordinated actor-critic (CAC) algorithms in which individually parametrized policies have a shared part (which is jointly optimized among all agents) and a personalized part (which is only locally optimized). Such partially personalized policies allow agents to coordinate by leveraging peers' experience while adapting to individual tasks. The flexibility in our design allows the proposed CAC algorithm to be used in a fully decentralized setting, where the agents can only communicate with their neighbors, as well as in a federated setting, where the agents occasionally communicate with a server while optimizing their (partially personalized) local models. Theoretically, we show that under some standard regularity assumptions, the proposed CAC algorithm requires $\mathcal{O}(\epsilon^{-5/2})$ samples to achieve an $\epsilon$-stationary solution (defined as a solution at which the squared norm of the gradient of the objective function is less than $\epsilon$). To the best of our knowledge, this work provides the first finite-sample guarantee for decentralized AC algorithms with partially personalized policies.
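The split between a shared part and a personalized part can be illustrated with a toy sketch. This is not the CAC algorithm from the paper: the quadratic per-agent losses, the ring mixing matrix, the step size, and all function names below are assumptions chosen only to show the structure, in which shared blocks are averaged with neighbors before a local gradient step and personalized blocks are updated purely locally.

```python
import numpy as np

def ring_mixing_matrix(n):
    """Doubly stochastic weights for a ring: each agent averages with its two neighbors."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 0.5
        W[i, (i - 1) % n] += 0.25
        W[i, (i + 1) % n] += 0.25
    return W

def toy_step(shared, personal, targets_shared, targets_personal, W, lr=0.1):
    """One toy update (assumed form, not the paper's update rule).

    Each agent i holds a shared block shared[i] and a personal block personal[i],
    with the stand-in loss 0.5*||shared[i]-a_i||^2 + 0.5*||personal[i]-b_i||^2.
    """
    grad_shared = shared - targets_shared        # local gradient w.r.t. shared block
    grad_personal = personal - targets_personal  # local gradient w.r.t. personal block
    shared = W @ shared - lr * grad_shared       # neighbor averaging, then local step
    personal = personal - lr * grad_personal     # local step only, no communication
    return shared, personal

def run(n_agents=4, dim=2, steps=300, seed=0):
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(n_agents, dim))  # hypothetical per-agent targets (shared block)
    b = rng.normal(size=(n_agents, dim))  # hypothetical per-agent targets (personal block)
    shared = np.zeros((n_agents, dim))
    personal = np.zeros((n_agents, dim))
    W = ring_mixing_matrix(n_agents)
    for _ in range(steps):
        shared, personal = toy_step(shared, personal, a, b, W)
    return shared, personal, a, b
```

In this toy setting, each personal block converges to its own agent's target, while the network average of the shared blocks converges to the average of the shared targets, which mirrors the idea of agents pooling experience through the shared part while still adapting individually.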

Pages: 13
