Learning to Coordinate in Multi-Agent Systems: A Coordinated Actor-Critic Algorithm and Finite-Time Guarantees

Cited by: 0
Authors
Zeng, Siliang [1]
Chen, Tianyi [2]
Garcia, Alfredo [3]
Hong, Mingyi [1]
Affiliations
[1] Univ Minnesota, Dept Elect & Comp Engn, Minneapolis, MN 55455 USA
[2] Rensselaer Polytech Inst, Dept Elect Comp & Syst Engn, Troy, NY 12181 USA
[3] Texas A&M Univ, Dept Ind & Syst Engn, College Stn, TX 77843 USA
Keywords
Multi-Agent Reinforcement Learning; Actor-Critic; Parameter Sharing; Optimization
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Multi-agent reinforcement learning (MARL) has attracted much research attention recently. However, unlike its single-agent counterpart, many theoretical and algorithmic aspects of MARL are not well understood. In this paper, we study the emergence of coordinated behavior by autonomous agents using an actor-critic (AC) algorithm. Specifically, we propose and analyze a class of coordinated actor-critic (CAC) algorithms in which individually parametrized policies have a shared part (which is jointly optimized among all agents) and a personalized part (which is only locally optimized). This kind of partially personalized policy allows agents to coordinate by leveraging peers' experience while adapting to individual tasks. The flexibility in our design allows the proposed CAC algorithm to be used in a fully decentralized setting, where the agents can only communicate with their neighbors, as well as in a federated setting, where the agents occasionally communicate with a server while optimizing their (partially personalized) local models. Theoretically, we show that under some standard regularity assumptions, the proposed CAC algorithm requires O(epsilon^{-5/2}) samples to achieve an epsilon-stationary solution (defined as a solution at which the squared norm of the gradient of the objective function is less than epsilon). To the best of our knowledge, this work provides the first finite-sample guarantee for decentralized AC algorithms with partially personalized policies.
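The split between a jointly optimized shared part and a locally optimized personalized part can be illustrated with a toy sketch. This is not the authors' CAC algorithm: simple quadratic losses stand in for the actor-critic objective, the ring communication graph and its mixing weights are assumptions, and every name below (`mixing_matrix_ring`, `run_toy_cac`, the target lists) is hypothetical. It only shows the update pattern the abstract describes, where shared parameters are averaged with neighbors and personalized parameters never leave the agent.

```python
# Toy sketch only: quadratic losses replace the actor-critic objective.
# All function and variable names here are hypothetical illustrations.

def mixing_matrix_ring(n):
    """Doubly stochastic weights on a ring: 1/2 on self, 1/4 on each neighbor."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        w[i][i] += 0.5
        w[i][(i - 1) % n] += 0.25
        w[i][(i + 1) % n] += 0.25
    return w


def run_toy_cac(shared_targets, personal_targets, step=0.1, iters=500):
    """Agent i minimizes 0.5*(s - a_i)^2 + 0.5*(p_i - b_i)^2.

    s is the shared part (one local copy per agent, mixed with neighbors);
    p_i is the personalized part (updated with local information only).
    """
    n = len(shared_targets)
    w = mixing_matrix_ring(n)
    shared = [0.0] * n    # each agent's local copy of the shared part
    personal = [0.0] * n  # each agent's personalized part

    for _ in range(iters):
        # Local gradients of the toy loss at the current parameters.
        g_shared = [shared[i] - shared_targets[i] for i in range(n)]
        g_personal = [personal[i] - personal_targets[i] for i in range(n)]

        # Shared part: average with neighbors, then take a local gradient step.
        mixed = [sum(w[i][j] * shared[j] for j in range(n)) for i in range(n)]
        shared = [mixed[i] - step * g_shared[i] for i in range(n)]

        # Personalized part: purely local gradient step, no communication.
        personal = [personal[i] - step * g_personal[i] for i in range(n)]

    return shared, personal


if __name__ == "__main__":
    s, p = run_toy_cac([1.0, 2.0, 3.0, 4.0], [-1.0, 0.0, 1.0, 2.0])
    print("mean of shared copies:", sum(s) / len(s))
    print("personalized parts:", p)
```

In this toy setting the network average of the shared copies approaches the average of the shared targets, while each personalized part approaches its own target. With a constant step size the individual shared copies only agree approximately, which is a property of this simplified consensus scheme and says nothing about the guarantees proved in the paper.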
Pages: 13
Related papers
50 in total
  • [31] Dynamic Spectrum Sharing Based on Federated Learning and Multi-Agent Actor-Critic Reinforcement Learning
    Yang, Tongtong
    Zhang, Wensheng
    Bo, Yulian
    Sun, Jian
    Wang, Cheng-Xiang
    2023 INTERNATIONAL WIRELESS COMMUNICATIONS AND MOBILE COMPUTING, IWCMC, 2023, : 947 - 952
  • [32] Actor-Critic Neural Network Based Finite-time Control for Uncertain Robotic Systems
    Lei, Changyi
    5TH INTERNATIONAL CONFERENCE ON INFORMATION SYSTEM AND DATA MINING (ICISDM 2021), 2021, : 34 - 40
  • [33] Multi-agent Gradient-Based Off-Policy Actor-Critic Algorithm for Distributed Reinforcement Learning
    Ren, Jineng
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2024, 17 (01)
  • [34] Privacy-Preserving Decentralized Actor-Critic for Cooperative Multi-Agent Reinforcement Learning
    Ahmed, Maheed H.
    Ghasemi, Mahsa
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [35] PRACM: Predictive Rewards for Actor-Critic with Mixing Function in Multi-Agent Reinforcement Learning
    Yu, Sheng
    Liu, Bo
    Zhu, Wei
    Liu, Shuhong
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT IV, KSEM 2023, 2023, 14120 : 69 - 82
  • [36] Multi-Agent Actor-Critic Multitask Reinforcement Learning based on GTD(1) with Consensus
    Stankovic, Milo S. S.
    Beko, Marko
    Ilic, Nemanja
    Stankovic, Srdjan S.
    2022 IEEE 61ST CONFERENCE ON DECISION AND CONTROL (CDC), 2022, : 4591 - 4596
  • [37] An Object Oriented Approach to Fuzzy Actor-Critic Learning for Multi-Agent Differential Games
    Schwartz, Howard
    2019 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2019), 2019, : 183 - 190
  • [38] Multi-agent Actor-Critic Reinforcement Learning Based In-network Load Balance
    Mai, Tianle
    Yao, Haipeng
    Xiong, Zehui
    Guo, Song
    Niyato, Dusit Tao
    2020 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2020
  • [39] Optimal Consensus Control for Continuous-time Multi-agent Systems via Actor-Critic Neural Networks
    Jia, Xiao
    Wolter, Katinka
    2022 8TH INTERNATIONAL CONFERENCE ON AUTOMATION, ROBOTICS AND APPLICATIONS (ICARA 2022), 2022, : 191 - 195
  • [40] Capacity-Limited Decentralized Actor-Critic for Multi-Agent Games
    Malloy, Tyler
    Sims, Chris R.
    Klinger, Tim
    Liu, Miao
    Riemer, Matthew
    Tesauro, Gerald
    2021 IEEE CONFERENCE ON GAMES (COG), 2021, : 332 - 339