Tracking control of AUV via novel soft actor-critic and suboptimal demonstrations

Cited by: 4
Authors
Zhang, Yue [1 ]
Zhang, Tianze [1 ]
Li, Yibin [1 ]
Zhuang, Yinghao [1 ]
Affiliations
[1] Shandong Univ, Inst Marine Sci & Technol, Qingdao 266237, Shandong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Autonomous underwater vehicle (AUV); Tracking control; Reinforcement learning (RL); Suboptimal demonstration; Soft actor-critic (SAC); Recurrent neural network (RNN);
DOI
10.1016/j.oceaneng.2023.116540
CLC classification
U6 [Waterway transportation]; P75 [Ocean engineering];
Subject classification codes
0814 ; 081505 ; 0824 ; 082401 ;
Abstract
Tracking control for autonomous underwater vehicles (AUVs) faces multifaceted challenges, making the acquisition of optimal demonstrations a daunting task, while suboptimal demonstrations imply reduced tracking accuracy. To address the issue of learning from suboptimal demonstrations, this paper proposes a model-free reinforcement learning (RL) method. Our approach utilizes suboptimal demonstrations to obtain an initial controller, which is iteratively refined during training. Given their suboptimal nature, the demonstrations are removed from the replay buffer once it reaches capacity. Building upon the soft actor-critic (SAC), our approach integrates a recurrent neural network (RNN) into the policy network to capture the relationship between states and actions. Moreover, we introduce logarithmic and cosine functions into the reward function to enhance training effectiveness. Finally, we validate the effectiveness of the proposed Initialize Controller from Demonstrations (ICfD) algorithm through simulations with two reference trajectories, under a stated definition of tracking success. The success rates of ICfD on the two reference trajectories are 95.60% and 94.05%, respectively, surpassing the state-of-the-art RL method SACfD (80.03% and 90.55%). The average one-step distance errors of ICfD are 1.20 m and 0.76 m, respectively, significantly lower than those of the S-plane controller (9.725 m and 8.325 m). In addition, we evaluate the generalization of the ICfD controller in different scenarios.
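The abstract's key mechanism is a replay buffer that is pre-loaded with suboptimal demonstration transitions, so that early SAC updates yield an initial controller, and that discards those demonstrations once capacity is reached. The following is a minimal sketch of that idea, assuming a standard SAC-style off-policy setup; the class name, transition format, and first-in-first-out eviction of demonstrations are illustrative assumptions, not the paper's implementation.

    import random
    from collections import deque, namedtuple

    # A generic transition as stored by an off-policy agent such as SAC.
    Transition = namedtuple("Transition", "state action reward next_state done")

    class DemoSeededReplayBuffer:
        """Replay buffer pre-filled with (suboptimal) demonstration transitions.

        Demonstrations bootstrap the initial controller; because they are the
        oldest entries, they are evicted before agent-collected experience once
        the buffer reaches capacity (hypothetical sketch of the described idea).
        """

        def __init__(self, capacity, demonstrations):
            self.capacity = capacity
            # Demonstrations go in first, so they are the first to be dropped
            # when the buffer fills up with self-collected transitions.
            self.buffer = deque(demonstrations, maxlen=capacity)

        def add(self, state, action, reward, next_state, done):
            # A deque with maxlen removes the oldest element automatically,
            # i.e. any remaining demonstrations are pushed out first.
            self.buffer.append(Transition(state, action, reward, next_state, done))

        def sample(self, batch_size):
            # Uniform mini-batch sampling over demonstrations and experience alike.
            return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))

        def __len__(self):
            return len(self.buffer)

    # Usage sketch: seed with demonstration transitions, then train SAC as usual.
    demos = [Transition([0.0], [0.0], 0.0, [0.0], False) for _ in range(100)]
    buffer = DemoSeededReplayBuffer(capacity=10_000, demonstrations=demos)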
Pages: 13
Related papers
50 records in total
  • [21] Trajectory Tracking Control for Robotic Manipulator Based on Soft Actor-Critic and Generative Adversarial Imitation Learning
    Hu, Jintao
    Wang, Fujie
    Li, Xing
    Qin, Yi
    Guo, Fang
    Jiang, Ming
    BIOMIMETICS, 2024, 9 (12)
  • [22] A Soft Actor-Critic Algorithm for Sequential Recommendation
    Hong, Hyejin
    Kimura, Yusuke
    Hatano, Kenji
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, PT I, DEXA 2024, 2024, 14910 : 258 - 266
  • [23] Soft Actor-Critic Algorithm with Adaptive Normalization
    Gao, Xiaonan
    Wu, Ziyi
    Zhu, Xianchao
    Cai, Lei
    JOURNAL OF NONLINEAR FUNCTIONAL ANALYSIS, 2025, 2025
  • [24] Optimal Tracking Control for Robotic Manipulator using Actor-Critic Network
    Hu, Yong
    Cui, Lingguo
    Chai, Senchun
    2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC), 2021, : 1556 - 1561
  • [25] A Novel Actor-Critic Motor Reinforcement Learning for Continuum Soft Robots
    Pantoja-Garcia, Luis
    Parra-Vega, Vicente
    Garcia-Rodriguez, Rodolfo
    Vazquez-Garcia, Carlos Ernesto
    ROBOTICS, 2023, 12 (05)
  • [26] Actor-Critic Model Predictive Control
    Romero, Angel
    Song, Yunlong
    Scaramuzza, Davide
    2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2024), 2024, : 14777 - 14784
  • [27] Robot Skill Adaptation via Soft Actor-Critic Gaussian Mixture Models
    Nematollahi, Iman
    Rosete-Beas, Erick
    Roefer, Adrian
    Welschehold, Tim
    Valada, Abhinav
    Burgard, Wolfram
    2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2022, 2022, : 8651 - 8657
  • [28] Quantization-Based Adaptive Actor-Critic Tracking Control With Tracking Error Constraints
    Fan, Quan-Yong
    Yang, Guang-Hong
    Ye, Dan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (04) : 970 - 980
  • [29] Controlled Sensing and Anomaly Detection via Soft Actor-Critic Reinforcement Learning
    Zhong, Chen
    Gursoy, M. Cenk
    Velipasalar, Senem
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 4198 - 4202
  • [30] ISAACS: Iterative Soft Adversarial Actor-Critic for Safety
    Hsu, Kai-Chieh
    Nguyen, Duy P.
    Fisac, Jaime F.
    LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211, 2023, 211