Special Agents Policy Gradient In Value Decomposition-based Approach

Times cited: 0
Authors
Kang, Qitong [1, 2]
Wang, Fuyong [1, 2]
Liu, Zhongxin [1, 2]
Chen, Zengqiang [1, 2]
Affiliations
[1] Nankai Univ, Coll Artificial Intelligence, Tianjin 300350, Peoples R China
[2] Nankai Univ, Tianjin Key Lab Brain Sci & Intelligent Rehabil, Tianjin 300350, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multi-agent; Reinforcement Learning; Deep Learning; Policy Gradient; Value Decomposition-based
DOI
10.1109/DDCLS58216.2023.10165847
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In many real-world settings, such as soldiers and a general on a battlefield, or teammates and a goalkeeper on a soccer field, the "general" plays a significantly stronger role than the "soldier", so it is logical to assign the "general" higher "intelligence" and "flexibility"; we define such an agent as a special agent. Here, we propose a multi-agent reinforcement learning algorithm that gives the special agent stronger intelligence in a fully cooperative, heterogeneous multi-agent environment. Similar to QMIX, we design a common monotonic critic for all agents, but give the special agent a separate actor network to improve its "intelligence". In this way, giving the special agent greater ability improves the group's ability to cooperate while ensuring that the group remains cooperative. We evaluate the algorithm on two sets of StarCraft II micromanagement tasks, and the experimental results show that it has a significant advantage over baseline algorithms on tasks with significant heterogeneity.
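The abstract does not give the critic's architecture, so the following is only a minimal sketch of the general idea behind a QMIX-style monotonic critic: per-agent values are mixed into a joint value using state-conditioned weights forced to be non-negative, which makes the joint value non-decreasing in each agent's value. The class name `MonotonicMixer`, the linear hypernetworks, the layer sizes, and the random initialisation are all assumptions for illustration, not the authors' implementation. The special agent's separate actor network is not sketched here.

```python
import numpy as np


def elu(x):
    # ELU activation; it is monotonically increasing, which preserves monotonicity.
    return np.where(x > 0, x, np.exp(np.minimum(x, 0.0)) - 1.0)


class MonotonicMixer:
    """Hypothetical QMIX-style mixer: Q_tot is non-decreasing in every agent's Q."""

    def __init__(self, n_agents, state_dim, embed_dim=8, seed=0):
        rng = np.random.default_rng(seed)
        self.n_agents = n_agents
        self.embed_dim = embed_dim
        # Linear "hypernetworks" (an assumption): they map the global state
        # to the weights and biases of the mixing layers.
        self.hyper_w1 = rng.normal(size=(state_dim, n_agents * embed_dim))
        self.hyper_b1 = rng.normal(size=(state_dim, embed_dim))
        self.hyper_w2 = rng.normal(size=(state_dim, embed_dim))
        self.hyper_b2 = rng.normal(size=(state_dim,))

    def __call__(self, agent_qs, state):
        # Absolute values keep the mixing weights non-negative.
        w1 = np.abs(state @ self.hyper_w1).reshape(self.n_agents, self.embed_dim)
        b1 = state @ self.hyper_b1
        hidden = elu(agent_qs @ w1 + b1)
        w2 = np.abs(state @ self.hyper_w2)
        b2 = state @ self.hyper_b2
        return float(hidden @ w2 + b2)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    mixer = MonotonicMixer(n_agents=3, state_dim=4)
    state = rng.normal(size=4)
    agent_qs = rng.normal(size=3)
    print("Q_tot:", mixer(agent_qs, state))
```

Because every mixing weight is non-negative and the activation is increasing, raising any single agent's value can never lower the joint value; this is the property that lets each agent act greedily on its own value while staying consistent with the group objective.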
Pages: 1387-1391
Page count: 5