Probabilistic Reward-Based Reinforcement Learning for Multi-Agent Pursuit and Evasion

Cited by: 1

Authors:

Zhang, Bo-Kun [1]

Hu, Bin [1]

Chen, Long [1]

Zhang, Ding-Xue [2]

Cheng, Xin-Ming [3]

Guan, Zhi-Hong [1]

Affiliations:

[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China

[2] Yangtze Univ, Sch Petr Engn, Jingzhou 434023, Peoples R China

[3] Cent South Univ, Sch Automat, Changsha 410083, Peoples R China

Source:

PROCEEDINGS OF THE 33RD CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2021) | 2021

Keywords:

Reinforcement learning; Multi-agent; Pursuit-evasion; Probabilistic reward; SYSTEMS;

DOI:

10.1109/CCDC52312.2021.9601771

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

This article studies reinforcement learning for multi-agent pursuit-evasion games. The main problem with current multi-agent reinforcement learning is the low learning efficiency of the agents. An important factor behind this problem is the delay of the Q function in tracking a changing environment. To address it, a probabilistic reward distribution replaces the Q function in the multi-agent deep deterministic policy gradient (MADDPG) framework. The distributional Bellman equation is proved to converge and can be incorporated into the reinforcement learning algorithm. The probabilistic reward distribution is updated within the algorithm, so that the reward value adapts better to complex environments. At the same time, eliminating the reward delay improves the efficiency of the strategy and yields better pursuit-evasion results. Simulations and experiments show that the multi-agent algorithm with distributional rewards achieves better results in the configured environment.
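The abstract does not say how the reward distribution is parameterized, so the following is only an illustrative sketch of one common way to apply a distributional Bellman update: a categorical distribution over a fixed grid of return values, projected back onto that grid after the shift `reward + gamma * z`. The function names, the grid, and the categorical form are all assumptions made for illustration, not the authors' method.

```python
from typing import List


def project_distribution(
    probs: List[float],
    atoms: List[float],
    reward: float,
    gamma: float,
) -> List[float]:
    """Project the shifted distribution (reward + gamma * atoms) onto the fixed atoms.

    Hypothetical helper: assumes `atoms` is an evenly spaced, increasing grid
    with at least two points and `probs` sums to 1.
    """
    v_min, v_max = atoms[0], atoms[-1]
    delta = (v_max - v_min) / (len(atoms) - 1)
    projected = [0.0] * len(atoms)
    for p, z in zip(probs, atoms):
        # Apply the Bellman shift to one atom and clip it to the support.
        tz = min(max(reward + gamma * z, v_min), v_max)
        pos = (tz - v_min) / delta           # fractional index on the grid
        lower = int(pos)                     # floor, since pos >= 0
        upper = min(lower + 1, len(atoms) - 1)
        frac = pos - lower
        # Split the probability mass between the two neighbouring atoms.
        projected[lower] += p * (1.0 - frac)
        projected[upper] += p * frac
    return projected


def expected_value(probs: List[float], atoms: List[float]) -> float:
    """Mean of a categorical distribution; this is the scalar a Q function would store."""
    return sum(p * z for p, z in zip(probs, atoms))


if __name__ == "__main__":
    atoms = [0.0, 1.0, 2.0, 3.0, 4.0]
    probs = [0.2] * 5                        # uniform next-state distribution
    target = project_distribution(probs, atoms, reward=1.0, gamma=0.5)
    print(target, expected_value(target, atoms))
```

When no clipping occurs, the mean of the projected distribution equals `reward + gamma * mean(next distribution)`, so the usual scalar Bellman target is recovered as a special case while the critic keeps the full shape of the distribution.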

Citation

Pages: 3352-3357

Page count: 6

50 items in total

[31] Decentralized graph-based multi-agent reinforcement learning using reward machines
Hu, Jueming
Xu, Zhe
Wang, Weichang
Qu, Guannan
Pang, Yutian
Liu, Yongming
NEUROCOMPUTING, 2024, 564
[32] Multi-agent reinforcement learning based on self-satisfaction in sparse reward scenarios
Fang, Baofu
Tang, Dandan
Wang, Zaijun
Wang, Hao
INTERNATIONAL JOURNAL OF BIO-INSPIRED COMPUTATION, 2025, 25 (01)
[33] Reward-Filtering-Based Credit Assignment for Multi-Agent Deep Reinforcement Learning
Xu C.
Yin N.
Duan S.-H.
He H.
Wang R.
Jisuanji Xuebao/Chinese Journal of Computers, 2022, 45 (11): 2306-2320
[34] A Novel Method Combining Leader-Following Control and Reinforcement Learning for Pursuit Evasion Games of Multi-Agent Systems
Zhu, Zhe-Yang
Liu, Cheng-Lin
16TH IEEE INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION, ROBOTICS AND VISION (ICARCV 2020), 2020: 166-171
[35] Reward-Poisoning Attacks on Offline Multi-Agent Reinforcement Learning
Wu, Young
McMahan, Jeremy
Zhu, Xiaojin
Xie, Qiaomin
THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 9, 2023: 10426-10434
[36] Intrinsic Reward with Peer Incentives for Cooperative Multi-Agent Reinforcement Learning
Zhang, Tianle
Liu, Zhen
Wu, Shiguang
Pu, Zhiqiang
Yi, Jianqiang
2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
[37] Distributional Reward Estimation for Effective Multi-Agent Deep Reinforcement Learning
Hu, Jifeng
Sun, Yanchao
Chen, Hechang
Huang, Sili
Piao, Haiyin
Chang, Yi
Sun, Lichao
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
[38] Reward design for driver repositioning using multi-agent reinforcement learning
Shou, Zhenyu
Di, Xuan
TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2020, 119
[39] Multi-Agent Reinforcement Learning for Problems with Combined Individual and Team Reward
Sheikh, Hassam Ullah
Boloni, Ladislau
2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020
[40] Scalable Multi-Agent Reinforcement Learning for Networked Systems with Average Reward
Qu, Guannan
Lin, Yiheng
Wierman, Adam
Li, Na
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
