Tracking control of AUV via novel soft actor-critic and suboptimal demonstrations

Cited by: 4
Authors
Zhang, Yue [1]
Zhang, Tianze [1]
Li, Yibin [1]
Zhuang, Yinghao [1]
Affiliations
[1] Shandong Univ, Inst Marine Sci & Technol, Qingdao 266237, Shandong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Autonomous underwater vehicle (AUV); Tracking control; Reinforcement learning (RL); Suboptimal demonstration; Soft actor-critic (SAC); Recurrent neural network (RNN);
DOI
10.1016/j.oceaneng.2023.116540
Chinese Library Classification (CLC) number
U6 [Waterway transportation]; P75 [Ocean engineering]
Subject classification codes
0814; 081505; 0824; 082401
Abstract
Tracking control for autonomous underwater vehicles (AUVs) faces multifaceted challenges, which makes optimal demonstrations difficult to acquire. Suboptimal demonstrations, in turn, yield lower tracking accuracy. To address the problem of learning from suboptimal demonstrations, this paper proposes a model-free reinforcement learning (RL) method. Our approach uses suboptimal demonstrations to obtain an initial controller, which is then iteratively refined during training. Because the demonstrations are suboptimal, they are removed from the replay buffer once it reaches capacity. Building on soft actor-critic (SAC), our approach integrates a recurrent neural network (RNN) into the policy network to capture the relationship between states and actions. Moreover, we introduce logarithmic and cosine functions into the reward function to enhance training effectiveness. Finally, we validate the proposed Initialize Controller from Demonstrations (ICfD) algorithm through simulations with two reference trajectories, for which we provide a definition of tracking success. The success rates of ICfD on the two reference trajectories are 95.60% and 94.05%, respectively, surpassing the state-of-the-art RL method SACfD (80.03% and 90.55%). The average one-step distance errors of ICfD are 1.20 m and 0.76 m, respectively, significantly lower than those of the S-plane controller (9.725 m and 8.325 m). In addition, we evaluate the generalization of the ICfD controller in different scenarios.
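The abstract states that demonstrations leave the replay buffer once it reaches capacity. One way this could work is sketched below, assuming a plain first-in-first-out buffer that is seeded with the demonstrations, so that they are the oldest entries and are the first to be evicted. The class name `DemoSeededReplayBuffer`, the tuple layout, and the capacity value are hypothetical and are not taken from the paper.

```python
import random
from collections import deque


class DemoSeededReplayBuffer:
    """FIFO replay buffer pre-filled with demonstration transitions.

    Hypothetical sketch: the eviction rule (oldest first) is an assumption,
    not the paper's confirmed mechanism.
    """

    def __init__(self, capacity, demonstrations):
        # A deque with maxlen drops its oldest item when appended to at capacity.
        self.buffer = deque(maxlen=capacity)
        for transition in demonstrations:
            # The boolean flag marks the transition as a demonstration.
            self.buffer.append((transition, True))

    def add(self, transition):
        # Transitions collected by the agent are flagged as non-demonstrations.
        self.buffer.append((transition, False))

    def sample(self, batch_size):
        # Uniform sampling without replacement over everything currently stored.
        pool = list(self.buffer)
        batch = random.sample(pool, min(batch_size, len(pool)))
        return [transition for transition, _ in batch]

    def demo_count(self):
        # Number of demonstration transitions still in the buffer.
        return sum(1 for _, is_demo in self.buffer if is_demo)


if __name__ == "__main__":
    demos = [("demo", i) for i in range(3)]
    buf = DemoSeededReplayBuffer(capacity=5, demonstrations=demos)
    for step in range(4):
        buf.add(("agent", step))
        print(step, buf.demo_count())
```

With a capacity of 5 and 3 demonstrations, the demonstration count stays at 3 until the buffer is full and then drops by one with each new agent transition, so the early controller learns from demonstrations while later updates rely increasingly on the agent's own experience.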
Pages: 13