Reinforcement Learning-based path tracking for underactuated UUV under intermittent communication

Cited by: 4
Authors
Liu Z. [1 ]
Cai W. [1 ]
Zhang M. [2 ]
Affiliations
[1] Hangzhou Dianzi University, 2nd Street, Zhejiang, Hangzhou
[2] Zhejiang University of Water Resources and Electric Power, 2nd Street, Zhejiang, Hangzhou
Funding
National Natural Science Foundation of China;
Keywords
Intermittent communication; Path control; Self-attention mechanism; Soft Actor and Critic (SAC); Unmanned Underwater Vehicle (UUV);
DOI
10.1016/j.oceaneng.2023.116076
Abstract
This paper studies the path control of a six-degree-of-freedom underactuated Unmanned Underwater Vehicle (UUV) under limited communication conditions. Given the strong coupling among the six degrees of freedom of an underactuated UUV with unknown dynamics, traditional model-based control methods struggle to solve the three-dimensional path control problem effectively. A self-attention-based Soft Actor and Critic (A-SAC) algorithm is designed to learn an effective control policy from random paths. It overcomes the limited target acquisition the UUV faces in a real underwater environment, which arises mainly because the UUV cannot consistently receive information about its expected path. A new state space is designed, and a self-attention mechanism is introduced to make more efficient use of discontinuous path information. Validation experiments compare the method against classical Reinforcement Learning baselines such as DDPG and PPO. Compared with these existing methods, the proposed A-SAC algorithm learns the path control policy for a six-degree-of-freedom UUV operating in a complex environment more quickly and effectively. © 2023 Elsevier Ltd
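The paper provides no code; as a purely illustrative sketch of the core idea described in the abstract, the snippet below encodes a buffer of intermittently received path waypoints with single-head self-attention and pools the result into a fixed-size state embedding that could be fed to SAC actor and critic networks. All names, dimensions, and the per-waypoint feature layout here are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over waypoint tokens."""
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    d = Q.shape[-1]
    scores = softmax(Q @ K.T / np.sqrt(d), axis=-1)  # (N, N) attention weights
    return scores @ V                                # (N, d) encoded tokens

# Hypothetical state buffer: the last N waypoints received over the
# intermittent link, each with assumed features (x, y, z, time-since-receipt).
rng = np.random.default_rng(0)
N, feat, d = 8, 4, 16
waypoints = rng.normal(size=(N, feat))
Wq, Wk, Wv = (rng.normal(scale=0.1, size=(feat, d)) for _ in range(3))

encoded = self_attention(waypoints, Wq, Wk, Wv)
state_embedding = encoded.mean(axis=0)  # pooled embedding for the SAC networks
print(state_embedding.shape)  # prints (16,)
```

Because attention weights are computed across all buffered waypoints at once, the encoding is independent of how the gaps between received waypoints are distributed, which is one plausible reason such a mechanism helps with discontinuous path information.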
Related papers
50 items in total
  • [31] Reinforcement learning-based tracking control for AUVs subject to disturbances
    Wang, Guangcang
    Zhang, Dianfeng
    Wu, Zhaojing
    Proceedings of the 34th Chinese Control and Decision Conference, CCDC 2022, 2022, : 2222 - 2227
  • [32] Reinforcement learning-based optimal trajectory tracking control of surface vessels under input saturations
    Wei, Ziping
    Du, Jialu
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2023, 33 (06) : 3807 - 3825
  • [33] Reinforcement learning-based finite time control for the asymmetric underactuated tethered spacecraft with disturbances
    Lu, Yingbo
    Wang, Xingyu
    Liu, Ya
    Huang, Panfeng
    ACTA ASTRONAUTICA, 2024, 220 : 218 - 229
  • [34] A robot path tracking method based on manual guidance and path reinforcement learning
    Pan, Yong
    Chen, Chengjun
    Li, Dongnian
    Zhao, Zhengxu
    APPLIED INTELLIGENCE, 2025, 55 (02)
  • [35] Neural-Network-Based Reinforcement Learning Control for Path Following of Underactuated Ships
    Zhang Lixing
    Qiao Lei
    Chen Jianliang
    Zhang Weidong
    PROCEEDINGS OF THE 35TH CHINESE CONTROL CONFERENCE 2016, 2016, : 5786 - 5791
  • [36] Path Planning for Underactuated Unmanned Surface Vehicle Swarm Based on Deep Reinforcement Learning
    Hou, Yuli
    Wang, Ning
    Qiu, Chidong
    PROCEEDINGS OF THE 36TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC 2024, 2024, : 409 - 414
  • [37] A deep reinforcement learning-based distributed connected automated vehicle control under communication failure
    Shi, Haotian
    Zhou, Yang
    Wang, Xin
    Fu, Sicheng
    Gong, Siyuan
    Ran, Bin
    COMPUTER-AIDED CIVIL AND INFRASTRUCTURE ENGINEERING, 2022, 37 (15) : 2033 - 2051
  • [38] A reinforcement learning-based communication topology in particle swarm optimization
    Xu, Yue
    Pi, Dechang
    NEURAL COMPUTING AND APPLICATIONS, 2020, 32 (14) : 10007 - 10032
  • [40] Deep Reinforcement Learning-based Scheduling for Roadside Communication Networks
    Atallah, Ribal
    Assi, Chadi
    Khabbaz, Maurice
    2017 15TH INTERNATIONAL SYMPOSIUM ON MODELING AND OPTIMIZATION IN MOBILE, AD HOC, AND WIRELESS NETWORKS (WIOPT), 2017,