Continuous interval type-2 fuzzy Q-learning algorithm for trajectory tracking tasks for vehicles

Cited by: 2

Authors:

Xuan, Chengbin [1]

Lam, Hak-Keung [1]

Shi, Qian [1]

Chen, Ming [1]

Affiliation:

[1] King's College London, Department of Engineering, London WC2R 2LS, England

Source:

INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL | 2022 / Volume 32 / Issue 08

Keywords:

reinforcement learning; interval type-2 fuzzy system; vehicle automation; fuzzy Q-learning; fuzzy control

DOI:

10.1002/rnc.6056

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Trajectory tracking is a fundamental but challenging task for vehicle automation. Beyond system nonlinearity, the main difficulties stem from environmental noise and model uncertainties across different driving scenarios. Given these uncertainties, a reinforcement learning method with continuous actions and noise resistance is a promising way to overcome these issues. In this article, a novel continuous interval type-2 fuzzy Q-learning (CIT2FQL) algorithm is proposed for the trajectory tracking task. By introducing an n-dimensional interval type-2 fuzzy inference system (n-D IT2FIS) into fuzzy Q-learning, the proposed method achieves continuous Q-learning by combining action interpolation with IT2FIS for the first time. We also propose a simplified type-reduction method for the n-D IT2FIS to improve computational efficiency. Moreover, a radial basis function (RBF) layer is chosen as the basis function to achieve Q-value interpolation. Finally, a trajectory tracking task is conducted in a simulation environment to verify the effectiveness and robustness of the proposed method under different scenarios. The results demonstrate that the proposed method offers better robustness and noise resistance while maintaining good tracking performance compared with state-of-the-art baseline algorithms, including double deep Q network (DDQN), proximal policy optimization (PPO), and interval type-2 dynamic fuzzy Q-learning (IT2DFQL).
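
As a rough illustration of how an interval type-2 fuzzy system can turn a tracking error into a continuous action, the sketch below may help. It is not the CIT2FQL algorithm: the abstract does not specify the simplified type-reduction method, so the well-known Nie-Tan closed-form reduction is swapped in as a stand-in, and every name and parameter (`it2_gaussian`, `nie_tan_output`, `RULES`, the rule centers, widths, and consequents) is a hypothetical choice made for this example.

```python
import math


def it2_gaussian(x, center, sigma_lower, sigma_upper):
    """Return (lower, upper) membership grades of x.

    The fuzzy set is a Gaussian with an uncertain width: the narrower
    width gives the lower bound, the wider width gives the upper bound.
    """
    lower = math.exp(-0.5 * ((x - center) / sigma_lower) ** 2)
    upper = math.exp(-0.5 * ((x - center) / sigma_upper) ** 2)
    return lower, upper


def nie_tan_output(x, rules):
    """Nie-Tan type reduction (assumed stand-in, not the paper's method).

    Each rule consequent is weighted by the mean of its lower and upper
    firing strengths, then the weighted average is returned.
    """
    num = 0.0
    den = 0.0
    for center, sigma_lower, sigma_upper, consequent in rules:
        lower, upper = it2_gaussian(x, center, sigma_lower, sigma_upper)
        strength = 0.5 * (lower + upper)
        num += strength * consequent
        den += strength
    return num / den


# Hypothetical rule base: (center, sigma_lower, sigma_upper, steering action).
# A negative lateral error maps to a positive steering command and vice versa.
RULES = [
    (-1.0, 0.4, 0.6, 0.5),
    (0.0, 0.4, 0.6, 0.0),
    (1.0, 0.4, 0.6, -0.5),
]

steer = nie_tan_output(0.0, RULES)  # continuous action for zero error
```

Because the output is a smooth weighted average of the rule consequents, the action varies continuously with the input rather than jumping between discrete choices, which is the general property the abstract attributes to combining interpolation with an IT2FIS.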

Citation

Pages: 4788-4815

Page count: 28

50 items in total

[1] Self-Organizing Interval Type-2 Fuzzy Q-Learning For Reinforcement Fuzzy Control
Hsu, Chia-Hung
Juang, Chia-Feng
2011 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2011: 2033-2038
[2] Optimal Trajectory Output Tracking Control with a Q-learning Algorithm
Vamvoudakis, Kyriakos G.
2016 AMERICAN CONTROL CONFERENCE (ACC), 2016: 5752-5757
[3] Trajectory tracking control for rotary steerable systems using interval type-2 fuzzy logic and reinforcement learning
Zhang, Chi
Zou, Wei
Cheng, Ningbo
Gao, Junshan
JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2018, 355 (02): 803-826
[4] HYBRID LEARNING ALGORITHM FOR INTERVAL TYPE-2 FUZZY LOGIC SYSTEMS
Mendez, G. M.
Leduc, L. A.
CONTROL AND INTELLIGENT SYSTEMS, 2006, 34 (03)
[5] Hybrid learning algorithm for interval type-2 fuzzy neural networks
Castro, Juan R.
Castillo, Oscar
Melin, Patricia
Rodriguez-Diaz, Antonio
GRC: 2007 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, PROCEEDINGS, 2007: 157-162
[6] Hybrid learning algorithm for interval type-2 fuzzy logic systems
Departamento de Ingeniería, Eléctrica y Electrónica, Instituto Tecnológico de Nuevo León, Mexico
Unknown
Control Intell Syst, 2006, 3 (206-215)
[7] A navigation method for mobile robots using interval type-2 fuzzy neural network fitting Q-learning in unknown environments
Yi, Zeren
Li, Guojin
Chen, Shuang
Xie, Wei
Xu, Bugong
JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2019, 37 (01): 1113-1121
[8] ENHANCEMENTS OF FUZZY Q-LEARNING ALGORITHM
Glowaty, Grzegorz
COMPUTER SCIENCE-AGH, 2005, 7: 77-87
[9] Antiforgetting Incremental Learning Algorithm for Interval Type-2 Fuzzy Neural Network
Sun, Chenxuan
Han, Honggui
Wu, Xiaolong
Yang, Hongyan
IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2024, 32 (04): 1938-1950
[10] Interval type-2 fuzzy automata and Interval type-2 fuzzy grammar
Sharan, S.
Sharma, B. K.
Jacob, Kavikumar
JOURNAL OF APPLIED MATHEMATICS AND COMPUTING, 2022, 68 (03): 1505-1526
