SMIX(λ): Enhancing Centralized Value Functions for Cooperative Multi-Agent Reinforcement Learning

Cited by: 0

Authors:

Wen, Chao [1]

Yao, Xinghu [1]

Wang, Yuhui [1]

Tan, Xiaoyang [1]

Affiliations:

[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, MIIT Key Lab Pattern Anal & Machine Intelligence, Collaborat Innovat Ctr Novel Software Technol & I, Nanjing 211106, Peoples R China

Source:

THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2020 / Vol. 34

Funding:

US National Science Foundation;

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This work presents a sample-efficient and effective value-based method, named SMIX(λ), for reinforcement learning in multi-agent environments (MARL) within the paradigm of centralized training with decentralized execution (CTDE), in which learning a stable and generalizable centralized value function (CVF) is crucial. To achieve this, our method carefully combines several elements: 1) removing the unrealistic centralized greedy assumption during the learning phase, 2) using the λ-return to balance the trade-off between bias and variance and to deal with the environment's non-Markovian property, and 3) adopting experience-replay-style off-policy training. Interestingly, there is an inherent connection between SMIX(λ) and the previous off-policy Q(λ) approach for single-agent learning. Experiments on the StarCraft Multi-Agent Challenge (SMAC) benchmark show that the proposed SMIX(λ) algorithm outperforms several state-of-the-art MARL methods by a large margin, and that it can be used as a general tool to improve the overall performance of a CTDE-type method by enhancing the evaluation quality of its CVF. We open-source our code at: https://github.com/chaovven/SMIX.
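
To make the λ-return mentioned in the abstract concrete, the following is a minimal sketch of the standard textbook recursion G_t = r_t + γ[(1 - λ) V(s_{t+1}) + λ G_{t+1}] over a single episode. It is not taken from the authors' repository: the function name `lambda_returns`, its argument layout, and the toy numbers are assumptions chosen for illustration, and the sketch does not model the mixing network or the replay buffer.

```python
from typing import List


def lambda_returns(rewards: List[float],
                   next_values: List[float],
                   gamma: float,
                   lam: float) -> List[float]:
    """Compute lambda-returns for one episode (illustrative sketch only).

    rewards[t]     : reward received after acting at step t
    next_values[t] : value estimate of the state reached after step t
                     (use 0.0 for a terminal state)
    """
    if len(rewards) != len(next_values):
        raise ValueError("rewards and next_values must have equal length")
    if not rewards:
        return []

    returns = [0.0] * len(rewards)
    # Beyond the last step there is nothing but the bootstrap value.
    g = next_values[-1]
    for t in reversed(range(len(rewards))):
        # Blend the one-step bootstrap target with the longer return.
        g = rewards[t] + gamma * ((1.0 - lam) * next_values[t] + lam * g)
        returns[t] = g
    return returns


# Hypothetical three-step episode that ends in a terminal state.
toy_rewards = [1.0, 1.0, 1.0]
toy_next_values = [0.5, 0.5, 0.0]
td_targets = lambda_returns(toy_rewards, toy_next_values, gamma=1.0, lam=0.0)
mc_targets = lambda_returns(toy_rewards, toy_next_values, gamma=1.0, lam=1.0)
mixed_targets = lambda_returns(toy_rewards, toy_next_values, gamma=1.0, lam=0.5)
```

With λ = 0 the targets reduce to one-step bootstrapped targets (lower variance, more bias from the value estimate); with λ = 1 they become full episode returns (less bias, more variance). Intermediate values of λ interpolate between the two, which is the trade-off the abstract refers to.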


Pages: 7301-7308

Page count: 8
