Temporal Inconsistency-Based Intrinsic Reward for Multi-Agent Reinforcement Learning

Authors
Sun, Shaoqi [1]
Xu, Kele [1]
Affiliations
[1] Natl Univ Def Technol, Natl Key Lab Parallel & Distributed Proc, Changsha, Peoples R China
DOI
10.1109/IJCNN54540.2023.10191420
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Multi-agent reinforcement learning (MARL) has shown promising results in many challenging sequential decision-making tasks, and deep neural networks have recently come to dominate the field. However, agents' policy networks may fall into local optima during training, which severely constrains exploration. To address this issue, we propose a novel MARL framework named PSAM, which contains a new temporal inconsistency-based intrinsic reward and a diversity control strategy. Specifically, we save the parameters of the deep models along the optimization path of each agent's policy network; these saved parameters are referred to as snapshots. We measure the difference between snapshots and use it as an intrinsic reward. Moreover, we propose a diversity control strategy to improve performance further. Finally, to verify the effectiveness of the proposed method, we conduct extensive experiments in several widely used MARL environments. The results show that in many environments, PSAM not only achieves state-of-the-art performance and prevents the policy network from getting stuck in local optima but also accelerates the agents' learning of the policy. It is worth noting that the proposed regularizer can be used in a plug-and-play manner without introducing any additional hyper-parameters or training costs.
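The snapshot idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the difference between snapshots is measured, so everything below is an assumption made for illustration rather than PSAM's actual definition: the toy linear-softmax policy, the `SnapshotBuffer` class, its `max_snapshots` parameter, and the choice of mean L1 distance between action distributions are all hypothetical.

```python
import math
from collections import deque


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(weights, obs):
    """Action distribution of a toy linear-softmax policy (hypothetical stand-in)."""
    logits = [sum(w * o for w, o in zip(row, obs)) for row in weights]
    return softmax(logits)


class SnapshotBuffer:
    """Keeps copies of recent policy parameters (the 'snapshots')."""

    def __init__(self, max_snapshots=3):
        # Assumption: only the most recent snapshots are retained.
        self.snapshots = deque(maxlen=max_snapshots)

    def save(self, weights):
        # Copy each row so later updates do not alter the stored snapshot.
        self.snapshots.append([list(row) for row in weights])

    def intrinsic_reward(self, weights, obs):
        """Mean L1 distance between current and snapshot action distributions.

        Assumption: L1 distance on policy outputs is used here only as one
        plausible way to quantify 'difference between snapshots'.
        """
        if not self.snapshots:
            return 0.0
        current = policy_probs(weights, obs)
        dists = [
            sum(abs(p - q) for p, q in zip(current, policy_probs(snap, obs)))
            for snap in self.snapshots
        ]
        return sum(dists) / len(dists)


# Usage: save a snapshot, change the parameters, then score an observation.
buffer = SnapshotBuffer(max_snapshots=2)
weights = [[0.0, 0.0], [0.0, 0.0]]
buffer.save(weights)
r_same = buffer.intrinsic_reward(weights, [1.0, 0.0])
weights[0][0] = 1.0  # stands in for a gradient step
r_changed = buffer.intrinsic_reward(weights, [1.0, 0.0])
```

In this sketch the reward is zero while the policy agrees with its stored snapshots and grows as its outputs drift away from them. How PSAM combines such a signal with the environment reward, and how its diversity control strategy works, is not stated in the abstract.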
Pages: 7