Optimal consensus control for multi-agent systems: Multi-step policy gradient adaptive dynamic programming method

Cited by: 4
Authors
Ji, Lianghao [1, 3]
Jian, Kai [1 ]
Zhang, Cuijuan [1 ]
Yang, Shasha [1 ]
Guo, Xing [1 ]
Li, Huaqing [2 ]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Image Cognit, Chongqing, Peoples R China
[2] Southwest Univ, Coll Elect & Informat Engn, Chongqing, Peoples R China
[3] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Image Cognit, Chongqing 400065, Peoples R China
Source
IET CONTROL THEORY AND APPLICATIONS | 2023 / Vol. 17 / Issue 11
Funding
National Natural Science Foundation of China;
Keywords
complex networks; dynamic programming; intelligent control; multi-agent systems; optimal control; OPTIMAL TRACKING CONTROL; ALGORITHM; FRAMEWORK;
DOI
10.1049/cth2.12473
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper presents a novel adaptive dynamic programming (ADP) method to solve the optimal consensus problem for a class of discrete-time multi-agent systems with completely unknown dynamics. Unlike classical reinforcement learning (RL)-based optimal control algorithms, which rely on the one-step temporal difference method, a multi-step (also called n-step) policy gradient ADP (MS-PGADP) algorithm, which has been proved to be more efficient owing to its faster propagation of the reward, is proposed to obtain the iterative control policies. Moreover, a novel Q-function is defined, which evaluates the performance of taking an action in the current state. Then, using the Lyapunov stability theorem and functional analysis, the optimality of the performance index function and the stability of the error system are proved. Furthermore, actor-critic neural networks (NNs) are used to implement the proposed method. Inspired by the deep Q-network, a target network is also introduced to guarantee the stability of the NNs during training. Finally, two simulations are conducted to verify the effectiveness of the proposed algorithm.
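The two generic ingredients named in the abstract, an n-step bootstrapped target and a target network, can be illustrated with a minimal sketch. This is not the paper's MS-PGADP algorithm: the function names `n_step_target` and `soft_update`, the discount factor `gamma`, the mixing rate `tau`, and the use of a soft (Polyak) target update are all assumptions made here for illustration, and the paper's actual utility function and update rule may differ.

```python
from typing import List, Sequence


def n_step_target(costs: Sequence[float], q_bootstrap: float, gamma: float) -> float:
    """Hypothetical helper: n-step target for a Q-function.

    Sums the n observed stage costs with discounting, then adds the
    discounted Q-value estimate at the state reached after n steps.
    With len(costs) == 1 this reduces to the one-step TD target.
    """
    target = 0.0
    for k, cost in enumerate(costs):
        target += (gamma ** k) * cost
    return target + (gamma ** len(costs)) * q_bootstrap


def soft_update(target_w: Sequence[float], online_w: Sequence[float], tau: float) -> List[float]:
    """Hypothetical helper: move target-network weights a fraction tau
    toward the online weights (an assumed Polyak-style update)."""
    return [(1.0 - tau) * t + tau * w for t, w in zip(target_w, online_w)]


if __name__ == "__main__":
    # One-step versus three-step target built from the same bootstrap estimate.
    print(n_step_target([1.0], q_bootstrap=4.0, gamma=0.5))
    print(n_step_target([1.0, 2.0, 4.0], q_bootstrap=8.0, gamma=0.5))
    print(soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1))
```

In the n-step form, each update uses several observed costs before bootstrapping, which is the usual explanation for the faster reward propagation mentioned in the abstract.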
Pages: 1443-1457
Page count: 15