Probabilistic Reward-Based Reinforcement Learning for Multi-Agent Pursuit and Evasion

Cited by: 1
Authors
Zhang, Bo-Kun [1]
Hu, Bin [1]
Chen, Long [1]
Zhang, Ding-Xue [2]
Cheng, Xin-Ming [3]
Guan, Zhi-Hong [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[2] Yangtze Univ, Sch Petr Engn, Jingzhou 434023, Peoples R China
[3] Cent South Univ, Sch Automat, Changsha 430083, Peoples R China
Keywords
Reinforcement learning; Multi-agent; Pursuit-evasion; Probabilistic reward; SYSTEMS;
DOI
10.1109/CCDC52312.2021.9601771
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
This article studies reinforcement learning for multi-agent pursuit-evasion games. The main problem of current multi-agent reinforcement learning is the low learning efficiency of the agents. An important factor behind this problem is that the Q function responds with a delay as the environment changes. To address this, a probabilistic reward distribution replaces the Q function in the multi-agent deep deterministic policy gradient (MADDPG) framework. The distributional Bellman equation is proved to be convergent and can be incorporated into the reinforcement learning algorithm. The probabilistic reward distribution is updated within the algorithm, so that the reward value adapts better to complex environments. At the same time, eliminating the reward delay improves the efficiency of the strategy and yields better pursuit-evasion results. Simulations and experiments show that the multi-agent algorithm with distributional rewards achieves better results in the given environment.
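
The abstract does not state how the reward distribution is represented or updated, so the following is only a minimal sketch of one common way to apply a distributional Bellman backup: a categorical distribution over fixed atoms, projected back onto the same atoms after the shift r + gamma * Z. The function names, the atom grid, and the categorical representation are assumptions made for illustration and are not taken from the paper.

```python
import math


def project_distribution(probs, atoms, reward, gamma):
    """Project the distribution of reward + gamma * Z onto the fixed atoms.

    Assumption: Z is categorical over evenly spaced `atoms` with
    probabilities `probs` (illustrative choice, not the paper's).
    """
    v_min, v_max = atoms[0], atoms[-1]
    delta = (v_max - v_min) / (len(atoms) - 1)
    projected = [0.0] * len(atoms)
    for p, z in zip(probs, atoms):
        # Apply the Bellman shift to this atom and clip it to the support.
        tz = min(max(reward + gamma * z, v_min), v_max)
        # Fractional position of the shifted atom on the fixed grid.
        b = (tz - v_min) / delta
        lower = int(math.floor(b))
        upper = min(lower + 1, len(atoms) - 1)
        # Split the probability mass between the two neighbouring atoms.
        weight_upper = b - lower
        projected[lower] += p * (1.0 - weight_upper)
        projected[upper] += p * weight_upper
    return projected


def expected_value(probs, atoms):
    """Mean of a categorical distribution; plays the role of a scalar Q value."""
    return sum(p * z for p, z in zip(probs, atoms))


atoms = [0.0, 1.0, 2.0]
probs = [0.0, 1.0, 0.0]  # all mass on the atom z = 1
target = project_distribution(probs, atoms, reward=1.0, gamma=0.5)
```

In this example the shifted atom lands at 1.5, halfway between the atoms 1 and 2, so its mass is split evenly between them and the mean of the target equals reward + gamma * E[Z]. A critic trained on such targets would keep the whole distribution rather than only its mean, which is the general idea behind replacing a scalar Q function with a distribution.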
Pages: 3352-3357
Page count: 6