Optimal Synchronization Control of Multiagent Systems With Input Saturation via Off-Policy Reinforcement Learning

Cited by: 103

Authors:

Qin, Jiahu [1]

Li, Man [1]

Shi, Yang [2]

Ma, Qichao [1]

Zheng, Wei Xing [3]

Affiliations:

[1] Univ Sci & Technol China, Dept Automat, Hefei 230027, Anhui, Peoples R China

[2] Univ Victoria, Dept Mech Engn, Victoria, BC V8W 2Y2, Canada

[3] Western Sydney Univ, Sch Comp Engn & Math, Sydney, NSW 2751, Australia

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2019 / Vol. 30 / No. 01

Funding:

Australian Research Council; National Natural Science Foundation of China;

Keywords:

Input saturation; multiagent systems; neural networks (NNs); off-policy reinforcement learning (RL); optimal synchronization control; LINEAR-SYSTEMS; NONLINEAR-SYSTEMS; NETWORKS; GAMES;

DOI:

10.1109/TNNLS.2018.2832025

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we investigate the optimal synchronization problem for a group of generic linear systems with input saturation. To seek the optimal controller, Hamilton-Jacobi-Bellman (HJB) equations involving nonquadratic input energy terms in coupled forms are established. The solutions to these coupled HJB equations are further proven to be optimal, and the induced controllers constitute an interactive Nash equilibrium. Because HJB equations are difficult to solve analytically, especially in coupled forms, and model information of the systems may be lacking, we apply the data-based off-policy reinforcement learning algorithm to learn the optimal control policies. As a byproduct, this off-policy algorithm is shown to be insensitive to the probing noise that is applied to the system to maintain the persistence of excitation condition. To implement this off-policy algorithm, we employ actor and critic neural networks to approximate the controllers and the cost functions. Furthermore, the estimated control policies obtained by the presented implementation are proven to converge to the optimal ones under certain conditions. Finally, an illustrative example is provided to verify the effectiveness of the proposed algorithm.

Pages: 85 - 96

Page count: 12

50 entries in total

[41] Off-Policy Differentiable Logic Reinforcement Learning
Zhang, Li
Li, Xin
Wang, Mingzhong
Tian, Andong
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2021: RESEARCH TRACK, PT II, 2021, 12976 : 617 - 632
[42] Marginalized Operators for Off-policy Reinforcement Learning
Tang, Yunhao
Rowland, Mark
Munos, Remi
Valko, Michal
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151 : 655 - 679
[43] Off-Policy Shaping Ensembles in Reinforcement Learning
Harutyunyan, Anna
Brys, Tim
Vrancx, Peter
Nowe, Ann
21ST EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2014), 2014, 263 : 1021 - 1022
[44] Learning Routines for Effective Off-Policy Reinforcement Learning
Cetin, Edoardo
Celiktutan, Oya
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
[45] Enhanced Strategies for Off-Policy Reinforcement Learning Algorithms in HVAC Control
Chen, Zhe
Jia, Qingshan
2024 14TH ASIAN CONTROL CONFERENCE, ASCC 2024, 2024, : 1691 - 1696
[46] Off-policy Reinforcement Learning for Robust Control of Discrete-time Uncertain Linear Systems
Yang, Yongliang
Guo, Zhishan
Wunsch, Donald
Yin, Yixin
PROCEEDINGS OF THE 36TH CHINESE CONTROL CONFERENCE (CCC 2017), 2017, : 2507 - 2512
[47] Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling
Xie, Tengyang
Ma, Yifei
Wang, Yu-Xiang
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
[48] Off-Policy Reinforcement Learning: Optimal Operational Control for Two-Time-Scale Industrial Processes
Li, Jinna
Kiumarsi, Bahare
Chai, Tianyou
Lewis, Frank L.
Fan, Jialu
IEEE TRANSACTIONS ON CYBERNETICS, 2017, 47 (12) : 4547 - 4558
[49] Synchronous optimal control method for nonlinear systems with saturating actuators and unknown dynamics using off-policy integral reinforcement learning
Zhang, Zenglian
Song, Ruizhuo
Cao, Min
NEUROCOMPUTING, 2019, 356 : 162 - 169
[50] Off-Policy Risk-Sensitive Reinforcement Learning-Based Constrained Robust Optimal Control
Li, Cong
Liu, Qingchen
Zhou, Zhehua
Buss, Martin
Liu, Fangzhou
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 53 (04) : 2478 - 2491