Special Agents Policy Gradient In Value Decomposition-based Approach

Authors
Kang, Qitong [1, 2]
Wang, Fuyong [1, 2]
Liu, Zhongxin [1, 2]
Chen, Zengqiang [1, 2]
Affiliations
[1] Nankai Univ, Coll Artificial Intelligence, Tianjin 300350, Peoples R China
[2] Nankai Univ, Tianjin Key Lab Brain Sci & Intelligent Rehabil, Tianjin 300350, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multi-agent; Reinforcement Learning; Deep Learning; Policy Gradient; Value Decomposition-based;
DOI
10.1109/DDCLS58216.2023.10165847
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In many real-world settings, such as soldiers and a general on a battlefield, or teammates and a goalkeeper on a soccer field, the "general" plays a significantly stronger role than the "soldier", so it is logical to assign higher "intelligence" and "flexibility" to the "general"; we define such an agent as a special agent. Here, we propose a multi-agent reinforcement learning algorithm that gives the special agent stronger intelligence in a fully cooperative heterogeneous multi-agent environment. Similar to QMIX, we design a common monotonic critic for all agents, but add a separate actor network for the special agent to improve its "intelligence". In this way, we improve the group's ability to cooperate by giving the special agent greater ability, while ensuring that the group remains cooperative. We evaluate the algorithm on two sets of StarCraft II micromanagement tasks, and the experimental results show that it has a significant advantage over baseline algorithms on tasks with significant heterogeneity.
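The monotonicity property that the abstract borrows from QMIX can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `monotonic_mix`, the fixed weights, and the linear form are assumptions chosen for illustration. In QMIX the mixing weights come from state-conditioned hypernetworks, and the separate actor network for the special agent is not shown here. The point is only that non-negative mixing weights make the joint value non-decreasing in every per-agent value.

```python
def monotonic_mix(agent_qs, weights, bias):
    """Mix per-agent Q-values into one joint value (hypothetical helper).

    Taking the absolute value of each weight keeps it non-negative, so the
    joint value can never decrease when any single agent's Q-value increases.
    """
    return sum(abs(w) * q for w, q in zip(weights, agent_qs)) + bias


# Illustrative numbers only; the weights may be negative before abs().
weights = [-0.7, 0.3, 1.2]
agent_qs = [0.2, -0.5, 1.0]

base = monotonic_mix(agent_qs, weights, bias=0.1)
# Raise only the first agent's Q-value and mix again.
raised = monotonic_mix([agent_qs[0] + 1.0] + agent_qs[1:], weights, bias=0.1)

# Monotonicity: improving one agent's value never lowers the joint value.
assert raised >= base
```

Because each agent's greedy action also maximises the joint value under this constraint, a shared critic of this kind can train all agents together while one agent is given an additional policy of its own.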
Pages: 1387-1391
Number of pages: 5