SMIX(λ): Enhancing Centralized Value Functions for Cooperative Multi-Agent Reinforcement Learning

Cited by: 0
Authors
Wen, Chao [1]
Yao, Xinghu [1]
Wang, Yuhui [1]
Tan, Xiaoyang [1]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, MIIT Key Lab Pattern Anal & Machine Intelligence, Collaborat Innovat Ctr Novel Software Technol & I, Nanjing 211106, Peoples R China
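Funding
U.S. National Science Foundation
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract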
This work presents a sample-efficient and effective value-based method, named SMIX(λ), for multi-agent reinforcement learning (MARL) within the paradigm of centralized training with decentralized execution (CTDE), in which learning a stable and generalizable centralized value function (CVF) is crucial. To achieve this, our method combines three elements: 1) removing the unrealistic centralized greedy assumption during the learning phase, 2) using the λ-return to balance the trade-off between bias and variance and to deal with the environment's non-Markovian property, and 3) adopting experience-replay-style off-policy training. Interestingly, we show that there is an inherent connection between SMIX(λ) and the earlier off-policy Q(λ) approach for single-agent learning. Experiments on the StarCraft Multi-Agent Challenge (SMAC) benchmark show that the proposed SMIX(λ) algorithm outperforms several state-of-the-art MARL methods by a large margin, and that it can be used as a general tool to improve the overall performance of a CTDE-type method by enhancing the evaluation quality of its CVF. We open-source our code at: https://github.com/chaovven/SMIX.
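The λ-return mentioned in the abstract can be illustrated with a minimal sketch. This is the generic textbook recursion for a single trajectory, not the authors' implementation; the function name `lambda_returns`, its parameters, and the default values of `gamma` and `lam` are assumptions chosen for illustration only.

```python
def lambda_returns(rewards, next_values, gamma=0.99, lam=0.8):
    """Compute lambda-returns for one trajectory (illustrative sketch).

    rewards[t]     : reward received at step t
    next_values[t] : value estimate of the state reached after step t
                     (use 0.0 if that state is terminal)
    Recursion: G_t = r_t + gamma * ((1 - lam) * V_{t+1} + lam * G_{t+1})
    """
    returns = [0.0] * len(rewards)
    # Past the final step only the bootstrap estimate is available.
    next_return = next_values[-1]
    for t in reversed(range(len(rewards))):
        # Blend the one-step bootstrap with the longer lambda-return.
        blended = (1.0 - lam) * next_values[t] + lam * next_return
        returns[t] = rewards[t] + gamma * blended
        next_return = returns[t]
    return returns
```

With `lam=0` the recursion reduces to the one-step bootstrapped target (lower variance, more bias); with `lam=1` it becomes the full discounted return bootstrapped only at the end (less bias, higher variance). Intermediate values interpolate between the two.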
Pages: 7301-7308
Page count: 8
Related papers
50 in total
  • [31] Privacy-Engineered Value Decomposition Networks for Cooperative Multi-Agent Reinforcement Learning
    Gohari, Parham
    Hale, Matthew
    Topcu, Ufuk
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023, : 8038 - 8044
  • [32] Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning
    Castellini, Jacopo
    Oliehoek, Frans A.
    Savani, Rahul
    Whiteson, Shimon
    AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2021, 35 (02)
  • [33] QDN: An Efficient Value Decomposition Method for Cooperative Multi-agent Deep Reinforcement Learning
    Xie, Zaipeng
    Zhang, Yufeng
    Shao, Pengfei
    Zhao, Weiyi
    2022 IEEE 34TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2022, : 1204 - 1211
  • [34] Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning
    Zohar, Roy
    Mannor, Shie
    Tennenholtz, Guy
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 9278 - 9285
  • [35] QVF: Incorporating quantile value function factorization into cooperative multi-agent reinforcement learning
    Huang, Anqi
    Wang, Yongli
    Liu, Ruoze
    Zou, Haochen
    Zhou, Xiaoliang
    PATTERN RECOGNITION, 2025, 161
  • [36] CLlight: Enhancing representation of multi-agent reinforcement learning with contrastive learning for cooperative traffic signal control
    Fu, Xiang
    Ren, Yilong
    Jiang, Han
    Lv, Jiancheng
    Cui, Zhiyong
    Yu, Haiyang
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 262
  • [37] Enhancing cooperative multi-agent reinforcement learning through the integration of R-STDP and federated learning
    Ramezanlou, Mohammad Tayefe
    Schwartz, Howard
    Lambadaris, Ioannis
    Barbeau, Michel
    NEUROCOMPUTING, 2025, 617
  • [38] A centralized reinforcement learning method for multi-agent job scheduling in Grid
    Moradi, Milad
    2016 6TH INTERNATIONAL CONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE), 2016, : 171 - 176
  • [39] Pacesetter Learning for Large Scale Cooperative Multi-Agent Reinforcement Learning
    Zhou, Pingqi
    Li, Chao
    Qiu, Mengwei
    Liu, Jun
    Ma, Chennan
    Yan, Ming
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VI, 2023, 14259 : 115 - 126
  • [40] Learning Distinct Strategies for Heterogeneous Cooperative Multi-agent Reinforcement Learning
    Wan, Kejia
    Xu, Xinhai
    Li, Yuan
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2021, PT IV, 2021, 12894 : 544 - 555