Neural Q-Learning Based on Residual Gradient for Nonlinear Control Systems

Cited by: 0
Authors
Si, Yanna [1 ]
Pu, Jiexin [1 ]
Zang, Shaofei [1 ]
Affiliations
[1] Henan Univ Sci & Technol, Sch Informat Engn, Luoyang, Peoples R China
Keywords
Q-learning; feedforward neural network; value function approximation; residual gradient method; nonlinear control systems;
DOI: not available
Chinese Library Classification (CLC): TP [Automation and computer technology]
Discipline code: 0812
Abstract
To solve the control problem of nonlinear systems over a continuous state space, this paper proposes a neural Q-learning algorithm based on the residual gradient method. First, a multi-layer feedforward neural network is used to approximate the Q-value function, overcoming the curse of dimensionality that limits classical tabular reinforcement learning. Then, based on the residual gradient method, mini-batch gradient descent over an experience-replay buffer is used to update the network parameters, which effectively reduces the number of iterations and speeds up learning. Moreover, momentum optimization is introduced to further stabilize the training process and improve convergence. To better balance exploration and exploitation, an epsilon-decreasing strategy replaces epsilon-greedy action selection. Simulation results on the CartPole control task demonstrate the correctness and effectiveness of the proposed algorithm.
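The update rule described in the abstract can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' code: a one-hidden-layer Q-network, a mini-batch residual-gradient update with momentum (differentiating the squared TD error through both Q(s, a) and Q(s', a'), unlike the direct "semi-gradient" update), and epsilon-decreasing action selection. All names, layer sizes, and hyperparameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class QNet:
    """One-hidden-layer feedforward Q-network: maps a state to one value per action."""
    def __init__(self, n_in, n_hidden, n_actions):
        self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_actions, n_hidden))
        self.b2 = np.zeros(n_actions)
        self.vel = [np.zeros_like(p) for p in self.params()]  # momentum buffers

    def params(self):
        return [self.W1, self.b1, self.W2, self.b2]

    def forward(self, s):
        h = np.tanh(self.W1 @ s + self.b1)
        return self.W2 @ h + self.b2, h

    def grad_q(self, s, a):
        """Gradient of Q(s, a) with respect to each parameter."""
        _, h = self.forward(s)
        dh = self.W2[a] * (1.0 - h ** 2)          # backprop through tanh
        gW2 = np.zeros_like(self.W2); gW2[a] = h
        gb2 = np.zeros_like(self.b2); gb2[a] = 1.0
        return [np.outer(dh, s), dh, gW2, gb2]

def residual_update(net, batch, gamma=0.99, lr=0.05, mom=0.9):
    """Mini-batch residual-gradient update with momentum (hyperparameters assumed)."""
    acc = [np.zeros_like(p) for p in net.params()]
    for s, a, r, s2, done in batch:
        q_sa = net.forward(s)[0][a]
        q2 = net.forward(s2)[0]
        a2 = int(np.argmax(q2))
        delta = r + (0.0 if done else gamma * q2[a2]) - q_sa
        g_sa = net.grad_q(s, a)
        g_s2 = net.grad_q(s2, a2)
        for i in range(len(acc)):
            # d/dtheta of 0.5*delta^2, with d(delta)/dtheta = gamma*grad Q(s',a') - grad Q(s,a)
            g_next = 0.0 if done else gamma * g_s2[i]
            acc[i] += delta * (g_next - g_sa[i])
    for p, v, g in zip(net.params(), net.vel, acc):
        v *= mom
        v += lr * g / len(batch)
        p -= v                                    # heavy-ball gradient descent step

def select_action(net, s, step, n_actions, eps0=1.0, decay=0.995, eps_min=0.05):
    """Epsilon-decreasing selection: exploration probability shrinks with the step count."""
    eps = max(eps_min, eps0 * decay ** step)
    if rng.random() < eps:
        return int(rng.integers(n_actions))
    return int(np.argmax(net.forward(s)[0]))
```

On terminal transitions the residual update reduces to plain regression of Q(s, a) toward the reward, so the sketch can be sanity-checked on a one-state, two-action problem before attaching it to a CartPole environment.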
Pages: 5