Neural Q-Learning Based on Residual Gradient for Nonlinear Control Systems

Cited by: 0

Authors:

Si, Yanna [1]

Pu, Jiexin [1]

Zang, Shaofei [1]

Affiliations:

[1] Henan Univ Sci & Technol, Sch Informat Engn, Luoyang, Peoples R China

Source:

ICCAIS 2019: THE 8TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND INFORMATION SCIENCES | 2019

Keywords:

Q-learning; feedforward neural network; value function approximation; residual gradient method; nonlinear control systems;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

To solve the control problem of nonlinear systems with continuous state spaces, this paper proposes a neural Q-learning algorithm based on the residual gradient method. First, a multi-layer feedforward neural network is used to approximate the Q-value function, overcoming the "curse of dimensionality" of classical reinforcement learning. Then, based on the residual gradient method, mini-batch gradient descent is implemented through experience replay to update the neural network parameters, which effectively reduces the number of iterations and increases the learning speed. Moreover, a momentum optimization method is introduced to further stabilize the training process and improve convergence. To better balance exploration and exploitation, an epsilon-decreasing strategy replaces epsilon-greedy for action selection. Simulation results on the CartPole control task demonstrate the correctness and effectiveness of the proposed algorithm.
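The update described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract gives no network size, step sizes, or decay schedule, so the single tanh hidden layer, the exponential epsilon decay, all default values, and every function name here are assumptions. The residual gradient method differentiates the squared Bellman residual with respect to both Q(s, a) and the bootstrapped max over Q(s', a'), with the maximizing action held fixed.

```python
import numpy as np

def init_params(n_state, n_hidden, n_action, rng):
    # Hypothetical architecture: one tanh hidden layer, one output per action.
    return {
        "W1": rng.normal(0.0, 0.1, (n_hidden, n_state)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_action, n_hidden)),
        "b2": np.zeros(n_action),
    }

def q_values(p, S):
    # Forward pass: returns Q(S, .) of shape (B, n_action) and hidden activations.
    H = np.tanh(S @ p["W1"].T + p["b1"])
    return H @ p["W2"].T + p["b2"], H

def grad_selected(p, S, H, A, coef):
    # Gradient of sum_i coef[i] * Q(S[i], A[i]) with respect to all parameters.
    B = S.shape[0]
    dQ = np.zeros((B, p["W2"].shape[0]))
    dQ[np.arange(B), A] = coef
    dZ = (dQ @ p["W2"]) * (1.0 - H ** 2)  # backprop through tanh
    return {"W1": dZ.T @ S, "b1": dZ.sum(0), "W2": dQ.T @ H, "b2": dQ.sum(0)}

def bellman_residual(p, batch, gamma):
    # delta_i = r_i + gamma * (1 - done_i) * max_a' Q(s'_i, a') - Q(s_i, a_i)
    S, A, R, S2, D = batch
    Q, H = q_values(p, S)
    Q2, H2 = q_values(p, S2)
    A2 = Q2.argmax(axis=1)
    idx = np.arange(len(A))
    delta = R + gamma * (1.0 - D) * Q2[idx, A2] - Q[idx, A]
    return delta, H, H2, A2

def residual_gradient_step(p, vel, batch, gamma=0.99, lr=0.01, momentum=0.9):
    # One mini-batch step on L = mean(delta^2) / 2 with momentum.
    # Unlike the usual semi-gradient update, the next-state term is also
    # differentiated (the greedy action A2 is treated as a constant).
    S, A, R, S2, D = batch
    delta, H, H2, A2 = bellman_residual(p, batch, gamma)
    B = len(A)
    g_next = grad_selected(p, S2, H2, A2, delta * gamma * (1.0 - D) / B)
    g_curr = grad_selected(p, S, H, A, -delta / B)
    for k in p:
        vel[k] = momentum * vel[k] - lr * (g_next[k] + g_curr[k])
        p[k] += vel[k]
    return 0.5 * np.mean(delta ** 2)  # loss measured before the update

def epsilon_decreasing(episode, eps_start=1.0, eps_min=0.01, decay=0.995):
    # Assumed schedule: exponential decay per episode, floored at eps_min.
    return max(eps_min, eps_start * decay ** episode)
```

In use, `batch` would be a tuple `(S, A, R, S2, D)` of arrays sampled uniformly from a replay buffer, `vel` a dictionary of zero arrays with the same shapes as the parameters, and `epsilon_decreasing(episode)` the probability of taking a random action in that episode.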

Pages: 5
