Learning to Coordinate in Multi-Agent Systems: A Coordinated Actor-Critic Algorithm and Finite-Time Guarantees

Cited by: 0

Authors:

Zeng, Siliang [1]

Chen, Tianyi [2]

Garcia, Alfredo [3]

Hong, Mingyi [1]

Affiliations:

[1] Univ Minnesota, Dept Elect & Comp Engn, Minneapolis, MN 55455 USA

[2] Rensselaer Polytech Inst, Dept Elect Comp & Syst Engn, Troy, NY 12181 USA

[3] Texas A&M Univ, Dept Ind & Syst Engn, College Stn, TX 77843 USA

Source:

LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 168 | 2022 / Vol. 168

Keywords:

Multi-Agent Reinforcement Learning; Actor-Critic; Parameter Sharing; OPTIMIZATION;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Multi-agent reinforcement learning (MARL) has attracted much research attention recently. However, unlike its single-agent counterpart, many theoretical and algorithmic aspects of MARL are not yet well understood. In this paper, we study the emergence of coordinated behavior by autonomous agents using an actor-critic (AC) algorithm. Specifically, we propose and analyze a class of coordinated actor-critic (CAC) algorithms in which individually parametrized policies have a shared part (which is jointly optimized among all agents) and a personalized part (which is only locally optimized). Such partially personalized policies allow agents to coordinate by leveraging peers' experience while adapting to individual tasks. The flexibility in our design allows the proposed CAC algorithm to be used in a fully decentralized setting, where the agents can only communicate with their neighbors, as well as in a federated setting, where the agents occasionally communicate with a server while optimizing their (partially personalized) local models. Theoretically, we show that under some standard regularity assumptions, the proposed CAC algorithm requires O(epsilon^{-5/2}) samples to achieve an epsilon-stationary solution (defined as a solution at which the squared norm of the gradient of the objective function is less than epsilon). To the best of our knowledge, this work provides the first finite-sample guarantee for a decentralized AC algorithm with partially personalized policies.
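
The split between a shared part and a personalized part can be illustrated with a small sketch. The abstract does not state the update rule, so everything below is an assumption for illustration only: the function name `coordinated_step`, the mixing matrix `mixing`, the step size `lr`, and the generic "average with neighbors, then take a gradient step" form are hypothetical and are not the authors' CAC algorithm. In an actual actor-critic method the gradients would come from critic-based estimates; here they are plain placeholder arrays.

```python
import numpy as np


def coordinated_step(shared, personal, grad_shared, grad_personal, mixing, lr=0.1):
    """One illustrative update for N agents (hypothetical, not the paper's rule).

    shared:   (N, d_s) array, each row is one agent's local copy of the shared part
    personal: (N, d_p) array, each row is one agent's personalized part
    mixing:   (N, N) weight matrix; row i weights the copies agent i can see
    """
    # Shared part: average the copies held by neighbors, then ascend the gradient.
    new_shared = mixing @ shared + lr * grad_shared
    # Personalized part: purely local gradient step, no communication.
    new_personal = personal + lr * grad_personal
    return new_shared, new_personal


# Toy usage: 3 fully connected agents with uniform weights and zero gradients,
# so only the effect of the averaging step is visible.
mixing = np.full((3, 3), 1.0 / 3.0)
shared = np.array([[0.0, 0.0], [3.0, 6.0], [6.0, 3.0]])
personal = np.array([[1.0], [2.0], [3.0]])
new_shared, new_personal = coordinated_step(
    shared, personal, np.zeros_like(shared), np.zeros_like(personal), mixing
)
```

With zero gradients, every agent's copy of the shared part moves to the average of all copies, while the personalized parts stay unchanged. A sparse `mixing` matrix would model the decentralized neighbor-only setting, and a uniform one applied only occasionally would loosely resemble the federated setting described in the abstract.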

Pages: 13
