Probabilistic Reward-Based Reinforcement Learning for Multi-Agent Pursuit and Evasion

Cited by: 1
Authors
Zhang, Bo-Kun [1]
Hu, Bin [1]
Chen, Long [1]
Zhang, Ding-Xue [2]
Cheng, Xin-Ming [3]
Guan, Zhi-Hong [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[2] Yangtze Univ, Sch Petr Engn, Jingzhou 434023, Peoples R China
[3] Cent South Univ, Sch Automat, Changsha 430083, Peoples R China
Keywords
Reinforcement learning; Multi-agent; Pursuit-evasion; Probabilistic reward; SYSTEMS;
DOI
10.1109/CCDC52312.2021.9601771
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This article studies reinforcement learning for multi-agent pursuit-evasion games. The main problem of current multi-agent reinforcement learning is the low learning efficiency of the agents. An important contributing factor is the delay of the Q function relative to changes in the environment. To address this, a probabilistic reward distribution replaces the Q function in the multi-agent deep deterministic policy gradient (MADDPG) framework. The distributional Bellman equation is proved to converge and can be incorporated into the reinforcement learning algorithm. The algorithm updates the reward distribution so that the reward value adapts better to complex environments. At the same time, eliminating the reward delay improves the efficiency of the strategy and yields better pursuit-evasion results. Simulations and experiments show that the multi-agent algorithm with distributional rewards achieves better results in the tested environment.
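The record does not give the authors' update rule, so the following is only a minimal sketch of one common way to apply a Bellman backup to a reward distribution rather than a scalar Q value: the categorical projection onto a fixed support used in the distributional reinforcement learning literature. The function names, the five-atom support, and the values of `reward` and `gamma` are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: a categorical (fixed-support) Bellman backup on a
# return distribution. Names and parameters are hypothetical, not the paper's.

def project_target(probs_next, reward, gamma, v_min, v_max):
    """Project the distribution of reward + gamma * Z' onto the fixed support."""
    n = len(probs_next)
    delta = (v_max - v_min) / (n - 1)  # spacing between support atoms
    target = [0.0] * n
    for j, p in enumerate(probs_next):
        z_j = v_min + j * delta
        # Shift and scale the atom, then clip it to the support range.
        tz = min(v_max, max(v_min, reward + gamma * z_j))
        b = (tz - v_min) / delta       # fractional index on the support
        lo = int(b)                    # lower neighbour (b is non-negative)
        hi = min(lo + 1, n - 1)        # upper neighbour, kept in range
        frac = b - lo
        # Split the probability mass linearly between the two neighbours.
        target[lo] += p * (1.0 - frac)
        target[hi] += p * frac
    return target


def expected_value(probs, v_min, v_max):
    """Mean of a categorical distribution on the evenly spaced support."""
    n = len(probs)
    delta = (v_max - v_min) / (n - 1)
    return sum(p * (v_min + i * delta) for i, p in enumerate(probs))


# Example: all next-state mass sits on the atom at 0 of the support [-2, 2].
next_probs = [0.0, 0.0, 1.0, 0.0, 0.0]
target = project_target(next_probs, reward=0.5, gamma=0.9, v_min=-2.0, v_max=2.0)
print(target, expected_value(target, -2.0, 2.0))
```

In this sketch a critic would be trained to match `target` (for example with a cross-entropy loss), and the actor would use the mean of the predicted distribution where a deterministic policy gradient method normally uses the scalar Q value. Whether the paper follows this scheme is an assumption.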
Pages: 3352-3357
Page count: 6