Hierarchical Multiagent Formation Control Scheme via Actor-Critic Learning

Cited by: 17

Authors:

Mu, Chaoxu [1]

Peng, Jiangwen [1]

Sun, Changyin [2]

Affiliations:

[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China

[2] Southeast Univ, Sch Automat, Nanjing 210096, Peoples R China

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2023 / Vol. 34 / No. 11

Funding:

National Natural Science Foundation of China;

Keywords:

Convergence; Hybrid fiber coaxial cables; Heuristic algorithms; Games; Dynamic programming; Microgrids; Computational complexity; Adaptive dynamic programming (ADP); hierarchical formation control (HFC); multiagent system (MAS); multistep generalized policy iteration (MsGPI); neural networks (NNs); GROUP CONSENSUS; SYSTEMS SUBJECT;

DOI:

10.1109/TNNLS.2022.3153028

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This article presents a nearly optimal solution to the cooperative formation control problem for large-scale multiagent systems (MASs). First, the multigroup technique is widely used to decompose large-scale problems, but it provides no consensus between different subgroups. Inspired by the hierarchical structures applied in MASs, a hierarchical leader-following formation control structure with the multigroup technique is constructed, in which two layers and three types of agents are designed. Second, the adaptive dynamic programming technique is adapted to the optimal formation control problem by establishing a performance index function. Based on the traditional generalized policy iteration (PI) algorithm, the multistep generalized policy iteration (MsGPI) algorithm is developed by modifying the policy evaluation step. The novel algorithm not only inherits the high convergence speed and low computational complexity of the generalized PI algorithm but also further accelerates convergence and reduces run time. In addition, stability, convergence, and optimality analyses are given for the proposed MsGPI algorithm. Afterward, a neural network-based actor-critic structure is built to approximate the iterative control policies and value functions. Finally, a large-scale formation control problem is provided to demonstrate the performance of the developed hierarchical leader-following formation control structure and the MsGPI algorithm.


Pages: 8764-8777

Number of pages: 14

