Tracking control of AUV via novel soft actor-critic and suboptimal demonstrations

Cited by: 4
Authors
Zhang, Yue [1]
Zhang, Tianze [1]
Li, Yibin [1]
Zhuang, Yinghao [1]
Affiliation
[1] Shandong Univ, Inst Marine Sci & Technol, Qingdao 266237, Shandong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Autonomous underwater vehicle (AUV); Tracking control; Reinforcement learning (RL); Suboptimal demonstration; Soft actor-critic (SAC); Recurrent neural network (RNN);
DOI
10.1016/j.oceaneng.2023.116540
Chinese Library Classification (CLC)
U6 [Waterway Transportation]; P75 [Ocean Engineering];
Discipline classification code
0814 ; 081505 ; 0824 ; 082401 ;
Abstract
Tracking control for autonomous underwater vehicles (AUVs) faces multifaceted challenges, which makes optimal demonstrations difficult to acquire. Suboptimal demonstrations, in turn, yield lower tracking accuracy. To address the problem of learning from suboptimal demonstrations, this paper proposes a model-free reinforcement learning (RL) method. The approach uses suboptimal demonstrations to obtain an initial controller, which is then iteratively refined during training. Because the demonstrations are suboptimal, they are removed from the replay buffer once it reaches capacity. Building on soft actor-critic (SAC), the approach integrates a recurrent neural network (RNN) into the policy network to capture the relationship between states and actions. In addition, logarithmic and cosine functions are introduced into the reward function to improve training effectiveness. Finally, the proposed Initialize Controller from Demonstrations (ICfD) algorithm is validated through simulations with two reference trajectories, using a definition of tracking success given in the paper. The success rates of ICfD on the two reference trajectories are 95.60% and 94.05%, respectively, surpassing the state-of-the-art RL method SACfD (80.03% and 90.55%). The average one-step distance errors of ICfD are 1.20 m and 0.76 m, respectively, significantly lower than those of the S-plane controller (9.725 m and 8.325 m). The generalization of the ICfD controller is also evaluated in different scenarios.
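The abstract states that demonstrations leave the replay buffer once it reaches capacity. One way to picture this is a first-in, first-out buffer that is pre-filled with demonstration transitions, so that newly collected agent transitions gradually push them out. The sketch below is only an illustration under that assumption; the class name `DemoSeededReplayBuffer`, its methods, and the string placeholders for transitions are hypothetical and are not taken from the paper, which may implement the removal differently.

```python
import random
from collections import deque


class DemoSeededReplayBuffer:
    """Hypothetical FIFO replay buffer pre-filled with demonstration transitions."""

    def __init__(self, capacity, demonstrations):
        # deque(maxlen=...) discards the oldest entry when a new one is
        # appended to a full buffer, so the demonstrations (inserted first)
        # are the first entries to be evicted.
        self.storage = deque(maxlen=capacity)
        for transition in demonstrations:
            self.storage.append(("demo", transition))

    def add(self, transition):
        # Transitions collected by the learning agent are tagged separately
        # so the remaining share of demonstrations can be inspected.
        self.storage.append(("agent", transition))

    def demo_count(self):
        return sum(1 for source, _ in self.storage if source == "demo")

    def sample(self, batch_size):
        # Uniform sampling over whatever the buffer currently holds.
        items = list(self.storage)
        return random.sample(items, min(batch_size, len(items)))


# Assumed toy sizes: three demonstration transitions, capacity of five.
buffer = DemoSeededReplayBuffer(capacity=5, demonstrations=["d0", "d1", "d2"])
for step in range(5):
    buffer.add(f"a{step}")
```

In this toy run the first two agent transitions fill the buffer, and each of the next three evicts one demonstration, so early training batches mix demonstration and agent data while later batches contain agent data only.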
Pages: 13