Tracking control of AUV via novel soft actor-critic and suboptimal demonstrations

Cited by: 4
Authors
Zhang, Yue [1 ]
Zhang, Tianze [1 ]
Li, Yibin [1 ]
Zhuang, Yinghao [1 ]
Affiliations
[1] Shandong Univ, Inst Marine Sci & Technol, Qingdao 266237, Shandong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Autonomous underwater vehicle (AUV); Tracking control; Reinforcement learning (RL); Suboptimal demonstration; Soft actor-critic (SAC); Recurrent neural network (RNN);
DOI
10.1016/j.oceaneng.2023.116540
CLC classification
U6 [Waterway transportation]; P75 [Ocean engineering];
Subject classification codes
0814 ; 081505 ; 0824 ; 082401 ;
Abstract
Tracking control for autonomous underwater vehicles (AUVs) faces multifaceted challenges, making the acquisition of optimal demonstrations a daunting task, while suboptimal demonstrations imply reduced tracking accuracy. To address the issue of learning from suboptimal demonstrations, this paper proposes a model-free reinforcement learning (RL) method. Our approach utilizes suboptimal demonstrations to obtain an initial controller, which is iteratively refined during training. Given their suboptimal nature, the demonstrations are removed from the replay buffer once it reaches capacity. Building upon the soft actor-critic (SAC), our approach integrates a recurrent neural network (RNN) into the policy network to capture the relationship between states and actions. Moreover, we introduce logarithmic and cosine functions into the reward function to enhance training effectiveness. Finally, we validate the effectiveness of the proposed Initialize Controller from Demonstrations (ICfD) algorithm through simulations with two reference trajectories, under a stated definition of tracking success. The success rates of ICfD on the two reference trajectories are 95.60% and 94.05%, respectively, surpassing the state-of-the-art RL method SACfD (80.03% and 90.55%). The average one-step distance errors of ICfD are 1.20 m and 0.76 m, respectively, significantly lower than those of the S-plane controller (9.725 m and 8.325 m). In addition, we evaluate the generalization of the ICfD controller in different scenarios.
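The abstract's key mechanism is a replay buffer that is pre-loaded with suboptimal demonstration transitions, so that early SAC updates yield an initial controller, and that discards those demonstrations once capacity is reached. The following is a minimal sketch of that idea, assuming a standard SAC-style off-policy setup; the class name, transition format, and first-in-first-out eviction of demonstrations are illustrative assumptions, not the paper's implementation.

    import random
    from collections import deque, namedtuple

    # A generic transition as stored by an off-policy agent such as SAC.
    Transition = namedtuple("Transition", "state action reward next_state done")

    class DemoSeededReplayBuffer:
        """Replay buffer pre-filled with (suboptimal) demonstration transitions.

        Demonstrations bootstrap the initial controller; because they are the
        oldest entries, they are evicted before agent-collected experience once
        the buffer reaches capacity (hypothetical sketch of the described idea).
        """

        def __init__(self, capacity, demonstrations):
            self.capacity = capacity
            # Demonstrations go in first, so they are the first to be dropped
            # when the buffer fills up with self-collected transitions.
            self.buffer = deque(demonstrations, maxlen=capacity)

        def add(self, state, action, reward, next_state, done):
            # A deque with maxlen removes the oldest element automatically,
            # i.e. any remaining demonstrations are pushed out first.
            self.buffer.append(Transition(state, action, reward, next_state, done))

        def sample(self, batch_size):
            # Uniform mini-batch sampling over demonstrations and experience alike.
            return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))

        def __len__(self):
            return len(self.buffer)

    # Usage sketch: seed with demonstration transitions, then train SAC as usual.
    demos = [Transition([0.0], [0.0], 0.0, [0.0], False) for _ in range(100)]
    buffer = DemoSeededReplayBuffer(capacity=10_000, demonstrations=demos)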
Pages: 13
Related papers
50 records in total
  • [21] Trajectory Tracking Control for Robotic Manipulator Based on Soft Actor-Critic and Generative Adversarial Imitation Learning
    Hu, Jintao
    Wang, Fujie
    Li, Xing
    Qin, Yi
    Guo, Fang
    Jiang, Ming
    BIOMIMETICS, 2024, 9 (12)
  • [22] A Soft Actor-Critic Algorithm for Sequential Recommendation
    Hong, Hyejin
    Kimura, Yusuke
    Hatano, Kenji
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, PT I, DEXA 2024, 2024, 14910 : 258 - 266
  • [23] Soft Actor-Critic Algorithm with Adaptive Normalization
    Gao, Xiaonan
    Wu, Ziyi
    Zhu, Xianchao
    Cai, Lei
    JOURNAL OF NONLINEAR FUNCTIONAL ANALYSIS, 2025, 2025
  • [24] Optimal Tracking Control for Robotic Manipulator using Actor-Critic Network
    Hu, Yong
    Cui, Lingguo
    Chai, Senchun
    2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC), 2021, : 1556 - 1561
  • [25] A Novel Actor-Critic Motor Reinforcement Learning for Continuum Soft Robots
    Pantoja-Garcia, Luis
    Parra-Vega, Vicente
    Garcia-Rodriguez, Rodolfo
    Vazquez-Garcia, Carlos Ernesto
    ROBOTICS, 2023, 12 (05)
  • [26] Actor-Critic Model Predictive Control
    Romero, Angel
    Song, Yunlong
    Scaramuzza, Davide
    2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2024), 2024, : 14777 - 14784
  • [27] Robot Skill Adaptation via Soft Actor-Critic Gaussian Mixture Models
    Nematollahi, Iman
    Rosete-Beas, Erick
    Roefer, Adrian
    Welschehold, Tim
    Valada, Abhinav
    Burgard, Wolfram
    2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2022, 2022, : 8651 - 8657
  • [28] Quantization-Based Adaptive Actor-Critic Tracking Control With Tracking Error Constraints
    Fan, Quan-Yong
    Yang, Guang-Hong
    Ye, Dan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (04) : 970 - 980
  • [29] Controlled Sensing and Anomaly Detection via Soft Actor-Critic Reinforcement Learning
    Zhong, Chen
    Gursoy, M. Cenk
    Velipasalar, Senem
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 4198 - 4202
  • [30] ISAACS: Iterative Soft Adversarial Actor-Critic for Safety
    Hsu, Kai-Chieh
    Nguyen, Duy P.
    Fisac, Jaime F.
    LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211, 2023, 211