Learning to Coordinate in Multi-Agent Systems: A Coordinated Actor-Critic Algorithm and Finite-Time Guarantees

Authors
Zeng, Siliang [1]
Chen, Tianyi [2]
Garcia, Alfredo [3]
Hong, Mingyi [1]
Affiliations
[1] Univ Minnesota, Dept Elect & Comp Engn, Minneapolis, MN 55455 USA
[2] Rensselaer Polytech Inst, Dept Elect Comp & Syst Engn, Troy, NY 12181 USA
[3] Texas A&M Univ, Dept Ind & Syst Engn, College Stn, TX 77843 USA
Keywords
Multi-Agent Reinforcement Learning; Actor-Critic; Parameter Sharing; Optimization
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Multi-agent reinforcement learning (MARL) has attracted much research attention recently. However, unlike its single-agent counterpart, many theoretical and algorithmic aspects of MARL are not yet well understood. In this paper, we study the emergence of coordinated behavior among autonomous agents using an actor-critic (AC) algorithm. Specifically, we propose and analyze a class of coordinated actor-critic (CAC) algorithms in which individually parametrized policies have a shared part (which is jointly optimized among all agents) and a personalized part (which is only locally optimized). Such partially personalized policies allow agents to coordinate by leveraging their peers' experience while adapting to their individual tasks. The flexibility of this design allows the proposed CAC algorithm to be used in a fully decentralized setting, where the agents can only communicate with their neighbors, as well as in a federated setting, where the agents occasionally communicate with a server while optimizing their (partially personalized) local models. Theoretically, we show that under some standard regularity assumptions, the proposed CAC algorithm requires O(ε^(-5/2)) samples to achieve an ε-stationary solution (defined as a solution at which the squared norm of the gradient of the objective function is less than ε). To the best of our knowledge, this work provides the first finite-sample guarantee for decentralized AC algorithms with partially personalized policies.
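The split between a shared part and a personalized part can be illustrated with a minimal toy sketch. This is not the CAC algorithm from the paper: the quadratic loss is a stand-in for an agent's actor-critic gradient estimate, there is no critic, and the names `local_gradient`, `coordinated_round`, `targets`, and `neighbors` are hypothetical. It only shows the communication pattern described in the abstract for the decentralized setting: every agent takes a local gradient step on both parts, and then only the shared part is averaged with its neighbors.

```python
def local_gradient(shared, personal, target):
    """Gradient of the toy loss 0.5 * (shared + personal - target)**2.

    Assumption: this scalar loss stands in for an agent's actor-critic
    gradient estimate; the paper's objective is not reproduced here.
    """
    residual = shared + personal - target
    return residual, residual  # (d/d shared, d/d personal)


def coordinated_round(shared, personal, targets, neighbors, lr=0.1):
    """One round: local gradient step, then neighbor averaging of the shared part."""
    n = len(shared)
    stepped_shared = []
    new_personal = []
    for i in range(n):
        g_shared, g_personal = local_gradient(shared[i], personal[i], targets[i])
        stepped_shared.append(shared[i] - lr * g_shared)
        # The personalized part is updated locally and never communicated.
        new_personal.append(personal[i] - lr * g_personal)

    # Consensus step: each agent averages the shared part over itself
    # and its neighbors (hypothetical uniform weights).
    new_shared = []
    for i in range(n):
        group = [i] + neighbors[i]
        new_shared.append(sum(stepped_shared[j] for j in group) / len(group))
    return new_shared, new_personal


# Four agents on a ring, each with a different individual task (target).
targets = [1.0, 2.0, 3.0, 4.0]
neighbors = [[3, 1], [0, 2], [1, 3], [2, 0]]
shared = [0.0] * 4
personal = [0.0] * 4
for _ in range(500):
    shared, personal = coordinated_round(shared, personal, targets, neighbors)
```

In this toy setup the shared values of all agents end up in agreement, while each personalized value absorbs the difference between that common value and the agent's own target. For a sense of scale of the stated sample complexity, setting ε = 0.01 gives ε^(-5/2) = 10^5, ignoring constants.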
Pages: 13
Related papers
50 in total
  • [21] Deployment Algorithm of Service Function Chain Based on Multi-Agent Soft Actor-Critic Learning
    Tang, Lun
    Li, Shirui
    Du, Yucong
    Chen, Qianbin
    JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2023, 45 (08) : 2893 - 2901
  • [22] Divergence-Regularized Multi-Agent Actor-Critic
    Su, Kefan
    Lu, Zongqing
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [23] Multi-agent Attention Actor-Critic Algorithm for Load Balancing in Cellular Networks
    Kang, Jikun
    Wu, Di
    Wang, Ju
    Hossain, Ekram
    Liu, Xue
    Dedek, Gregory
    ICC 2023-IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, 2023, : 5160 - 5165
  • [24] Toward Resilient Multi-Agent Actor-Critic Algorithms for Distributed Reinforcement Learning
    Lin, Yixuan
    Gade, Shripad
    Sandhu, Romeil
    Liu, Ji
    2020 AMERICAN CONTROL CONFERENCE (ACC), 2020, : 3953 - 3958
  • [25] Local Advantage Actor-Critic for Robust Multi-Agent Deep Reinforcement Learning
    Xiao, Yuchen
    Lyu, Xueguang
    Amato, Christopher
    2021 INTERNATIONAL SYMPOSIUM ON MULTI-ROBOT AND MULTI-AGENT SYSTEMS (MRS), 2021, : 155 - 163
  • [26] Multi-agent off-policy actor-critic algorithm for distributed multi-task reinforcement learning
    Stankovic, Milos S.
    Beko, Marko
    Ilic, Nemanja
    Stankovic, Srdjan S.
    EUROPEAN JOURNAL OF CONTROL, 2023, 74
  • [27] Finite-Time Analysis of Single-Timescale Actor-Critic
    Chen, Xuyang
    Zhao, Lin
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [28] Improving sample efficiency in Multi-Agent Actor-Critic methods
    Ye, Zhenhui
    Chen, Yining
    Jiang, Xiaohong
    Song, Guanghua
    Yang, Bowei
    Fan, Sheng
    APPLIED INTELLIGENCE, 2022, 52 (04) : 3691 - 3704
  • [29] Multi-Agent Actor-Critic with Hierarchical Graph Attention Network
    Ryu, Heechang
    Shin, Hayong
    Park, Jinkyoo
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 7236 - 7243
  • [30] Improving sample efficiency in Multi-Agent Actor-Critic methods
    Zhenhui Ye
    Yining Chen
    Xiaohong Jiang
    Guanghua Song
    Bowei Yang
    Sheng Fan
    Applied Intelligence, 2022, 52 : 3691 - 3704