Probabilistic Reward-Based Reinforcement Learning for Multi-Agent Pursuit and Evasion

被引:1
|
作者
Zhang, Bo-Kun [1 ]
Hu, Bin [1 ]
Chen, Long [1 ]
Zhang, Ding-Xue [2 ]
Cheng, Xin-Ming [3 ]
Guan, Zhi-Hong [1 ]
机构
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[2] Yangtze Univ, Sch Petr Engn, Jingzhou 434023, Peoples R China
[3] Cent South Univ, Sch Automat, Changsha 430083, Peoples R China
关键词
Reinforcement learning; Multi-agent; Pursuit-evasion; Probabilistic reward; SYSTEMS;
D O I
10.1109/CCDC52312.2021.9601771
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
The reinforcement learning is studied to solve the problem of multi-agent pursuit and evasion games in this article. The main problem of current reinforcement learning for multi-agents is the low learning efficiency of agents. An important factor leading to this problem is that the delay of the Q function is related to the environment changing. To solve this problem, a probabilistic distribution reward value is used to replace the Q function in the multi-agent depth deterministic policy gradient framework (hereinafter referred to as MADDPG). The distribution Bellman equation is proved to be convergent, and can be brought into the framework of reinforcement learning algorithm. The probabilistic distribution reward value is updated in the algorithm, so that the reward value can be more adaptive to the complex environment. In the same time, eliminating the delay of rewards improves the efficiency of the strategy and obtains a better pursuit-evasion results. The final simulation and experiment show that the multi-agent algorithm with distribution rewards achieves better results under the setting environment.
引用
收藏
页码:3352 / 3357
页数:6
相关论文
共 50 条
  • [41] Reinforcement learning based on multi-agent in RoboCup
    Zhang, W
    Li, JG
    Ruan, XG
    ADVANCES IN INTELLIGENT COMPUTING, PT 1, PROCEEDINGS, 2005, 3644 : 967 - 975
  • [42] Reward Function Design Method for Long Episode Pursuit Tasks Under Polar Coordinate in Multi-Agent Reinforcement Learning
    Dong Y.
    Cui T.
    Zhou Y.
    Song X.
    Zhu Y.
    Dong P.
    Journal of Shanghai Jiaotong University (Science), 2024, 29 (04) : 646 - 655
  • [43] Multi-Agent Reinforcement Learning
    Stankovic, Milos
    2016 13TH SYMPOSIUM ON NEURAL NETWORKS AND APPLICATIONS (NEUREL), 2016, : 43 - 43
  • [44] Distributed multi-agent deep reinforcement learning for cooperative multi-robot pursuit
    Yu, Chao
    Dong, Yinzhao
    Li, Yangning
    Chen, Yatong
    JOURNAL OF ENGINEERING-JOE, 2020, 2020 (13): : 499 - 504
  • [45] Mobile User Interface Adaptation Based on Usability Reward Model and Multi-Agent Reinforcement Learning
    Vidmanov, Dmitry
    Alfimtsev, Alexander
    MULTIMODAL TECHNOLOGIES AND INTERACTION, 2024, 8 (04)
  • [46] An Improved Approach towards Multi-Agent Pursuit-Evasion Game Decision-Making Using Deep Reinforcement Learning
    Wan, Kaifang
    Wu, Dingwei
    Zhai, Yiwei
    Li, Bo
    Gao, Xiaoguang
    Hu, Zijian
    ENTROPY, 2021, 23 (11)
  • [47] Decentralized Multi-Agent Reinforcement Learning in Average-Reward Dynamic DCOPs
    Duc Thien Nguyen
    Yeoh, William
    Lau, Hoong Chuin
    Zilberstein, Shlomo
    Zhang, Chongjie
    PROCEEDINGS OF THE TWENTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2014, : 1447 - 1455
  • [48] Multi-Agent Deep Reinforcement Learning With Progressive Negative Reward for Cryptocurrency Trading
    Kumlungmak, Kittiwin
    Vateekul, Peerapon
    IEEE ACCESS, 2023, 11 : 66440 - 66455
  • [49] Leaders and Collaborators: Addressing Sparse Reward Challenges in Multi-Agent Reinforcement Learning
    Sun, Shaoqi
    Liu, Hui
    Xu, Kele
    Ding, Bo
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024,
  • [50] Role differentiation process by division of reward function in multi-agent reinforcement learning
    Taniguchi, Tadahiro
    Tabuchi, Kazuma
    Sawaragi, Tetsuo
    2008 PROCEEDINGS OF SICE ANNUAL CONFERENCE, VOLS 1-7, 2008, : 358 - +