Tracking control of AUV via novel soft actor-critic and suboptimal demonstrations

Cited by: 4

Authors:

Zhang, Yue [1]

Zhang, Tianze [1]

Li, Yibin [1]

Zhuang, Yinghao [1]

Affiliation:

[1] Shandong Univ, Inst Marine Sci & Technol, Qingdao 266237, Shandong, Peoples R China

Source:

OCEAN ENGINEERING | 2024 / Vol. 293

Funding:

National Natural Science Foundation of China;

Keywords:

Autonomous underwater vehicle (AUV); Tracking control; Reinforcement learning (RL); Suboptimal demonstration; Soft actor-critic (SAC); Recurrent neural network (RNN);

DOI:

10.1016/j.oceaneng.2023.116540

Chinese Library Classification (CLC) codes:

U6 [Waterway Transportation]; P75 [Ocean Engineering];

Discipline classification codes:

0814 ; 081505 ; 0824 ; 082401 ;

Abstract:

Tracking control for autonomous underwater vehicles (AUVs) faces multifaceted challenges, which make optimal demonstrations difficult to acquire, and suboptimal demonstrations yield lower tracking accuracy. To address the issue of learning from suboptimal demonstrations, this paper proposes a model-free reinforcement learning (RL) method. Our approach uses suboptimal demonstrations to obtain an initial controller, which is iteratively refined during training. Because the demonstrations are suboptimal, they are removed from the replay buffer once it reaches capacity. Building upon soft actor-critic (SAC), our approach integrates a recurrent neural network (RNN) into the policy network to capture the relationship between states and actions. Moreover, we introduce logarithmic and cosine functions into the reward function to enhance training effectiveness. Finally, we validate the proposed Initialize Controller from Demonstrations (ICfD) algorithm through simulations with two reference trajectories, for which we provide a definition of tracking success. The success rates of ICfD on the two reference trajectories are 95.60% and 94.05%, respectively, surpassing the state-of-the-art RL method SACfD (80.03% and 90.55%). The average one-step distance errors of ICfD are 1.20 m and 0.76 m, respectively, significantly lower than those of the S-plane controller (9.725 m and 8.325 m). In addition, we evaluate the generalization of the ICfD controller in different scenarios.
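
The abstract states only that demonstrations are removed from the replay buffer once it reaches capacity; it does not give the eviction rule. One plausible reading, sketched below as an assumption rather than the authors' implementation, is a first-in-first-out buffer that is pre-filled with demonstration transitions, so that newer agent transitions push the demonstrations out first. The class name `DemoSeededReplayBuffer`, its methods, and the placeholder transition tuples are all hypothetical.

```python
from collections import deque
import random


class DemoSeededReplayBuffer:
    """FIFO replay buffer pre-filled with demonstration transitions.

    Hypothetical sketch: demonstrations enter first, so once the buffer
    is full they are the first entries evicted by newer agent data.
    """

    def __init__(self, capacity, demonstrations=()):
        # deque(maxlen=...) drops the oldest item when a new one is
        # appended to a full buffer.
        self.buffer = deque(maxlen=capacity)
        for transition in demonstrations:
            self.buffer.append(("demo", transition))

    def add(self, transition):
        # Store a transition collected by the learning agent.
        self.buffer.append(("agent", transition))

    def sample(self, batch_size):
        # Uniform sampling without replacement over whatever is stored.
        k = min(batch_size, len(self.buffer))
        return [t for _, t in random.sample(list(self.buffer), k)]

    def demo_count(self):
        # Number of demonstration transitions still in the buffer.
        return sum(1 for source, _ in self.buffer if source == "demo")


# Placeholder transitions: (state, action, reward, next_state, done).
demos = [("s%d" % i, "a", 0.0, "s_next", False) for i in range(3)]
buf = DemoSeededReplayBuffer(capacity=5, demonstrations=demos)
for step in range(4):
    buf.add(("obs%d" % step, "act", 1.0, "obs_next", False))
print(buf.demo_count())  # 1: two of the three demonstrations were evicted
```

With this design the demonstrations shape early updates and then fade out without any explicit bookkeeping; a different eviction rule (for example, purging all demonstrations at once when the buffer fills) would be equally consistent with the abstract.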

Pages: 13

