Cooperative Multi-Agent Deep Reinforcement Learning with Counterfactual Reward

Cited by: 2
Authors
Shao, Kun [1, 2]
Zhu, Yuanheng [1]
Tang, Zhentao [1, 2]
Zhao, Dongbin [1, 2]
Affiliations
[1] Chinese Academy of Sciences, Institute of Automation, State Key Laboratory of Management and Control for Complex Systems, Beijing, People's Republic of China
[2] University of Chinese Academy of Sciences, School of Artificial Intelligence, Beijing, People's Republic of China
Keywords
reinforcement learning; deep reinforcement learning; cooperative games; counterfactual reward; LEVEL; GAME; GO;
DOI
10.1109/ijcnn48605.2020.9207169
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
In partially observable, fully cooperative games, agents typically maximize a global reward through joint actions, which makes it difficult for each agent to deduce its own contribution. To address this credit assignment problem, we propose a multi-agent reinforcement learning algorithm with a counterfactual reward mechanism, termed CoRe. CoRe computes the difference in global reward under the condition that an agent takes other actions instead of its actual action while the other agents keep their actual actions fixed. This difference determines each agent's contribution to the global reward. We evaluate CoRe in a simplified Pig Chase game with a decentralised Deep Q Network (DQN) framework. The proposed method helps agents learn collaborative behaviors end to end. Compared with other DQN variants trained on the global reward, CoRe significantly improves learning efficiency and achieves better results. In addition, CoRe shows excellent performance in game environments of various sizes.
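The counterfactual idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the reward differences over alternative actions are aggregated, so averaging them is an assumption made here, and the names `counterfactual_reward` and `toy_global_reward` and the two-agent toy game are hypothetical.

```python
from typing import Callable, Sequence, Tuple

JointAction = Tuple[int, ...]


def counterfactual_reward(
    global_reward: Callable[[JointAction], float],
    joint_action: JointAction,
    agent: int,
    actions: Sequence[int],
) -> float:
    """Global reward minus the mean global reward obtained when `agent`
    swaps in each alternative action while the other agents keep their
    actual actions. Averaging over alternatives is an assumption."""
    actual = global_reward(joint_action)
    alternatives = [a for a in actions if a != joint_action[agent]]
    if not alternatives:
        return 0.0  # no alternative action exists, so no counterfactual
    baseline = 0.0
    for alt in alternatives:
        # Replace only this agent's action; all other actions stay fixed.
        swapped = joint_action[:agent] + (alt,) + joint_action[agent + 1:]
        baseline += global_reward(swapped)
    baseline /= len(alternatives)
    return actual - baseline


def toy_global_reward(joint_action: JointAction) -> float:
    """Hypothetical cooperative reward: 1 only if every agent picks action 1."""
    return 1.0 if all(a == 1 for a in joint_action) else 0.0


if __name__ == "__main__":
    # Both agents cooperate: each one's deviation would lose the reward.
    print(counterfactual_reward(toy_global_reward, (1, 1), 0, [0, 1]))
    # Agent 0 defects: switching to action 1 would have earned the reward.
    print(counterfactual_reward(toy_global_reward, (0, 1), 0, [0, 1]))
```

In this toy game a positive value means the agent's actual action helped the team relative to its alternatives, and a negative value means an alternative would have done better. In a decentralised DQN setup such a per-agent signal would stand in for the shared global reward during training.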
Pages: 8