Autonomous Trajectory Planning Method for Stratospheric Airship Regional Station-Keeping Based on Deep Reinforcement Learning

Cited by: 2
Authors
Liu, Sitong [1, 2]
Zhou, Shuyu [1]
Miao, Jinggang [1, 2]
Shang, Hai [1 ]
Cui, Yuxuan [1 ]
Lu, Ying [1 ]
Affiliations
[1] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100094, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100190, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
trajectory planning; stratospheric airship; deep reinforcement learning; proximal policy optimization (PPO); regional station-keeping; VEHICLE;
DOI
10.3390/aerospace11090753
Chinese Library Classification number
V [Aeronautics and Astronautics];
Discipline classification code
08; 0825;
Abstract
The stratospheric airship, a near-space vehicle, is increasingly used in scientific exploration and Earth observation because of its long endurance and regional observation capabilities. However, the complex stratospheric wind field makes trajectory planning for these airships a significant challenge. Unlike lower atmospheric levels, the stratosphere presents a wind field with significant variability in wind speed and direction, which can drastically affect the stability of the airship's trajectory. Recent advances in deep reinforcement learning (DRL) offer promising avenues for trajectory planning: DRL algorithms can learn complex control strategies autonomously by interacting with the environment. In particular, the proximal policy optimization (PPO) algorithm has proven effective in continuous control tasks and is well suited to the non-linear, high-dimensional problem of trajectory planning in dynamic environments. This paper proposes a trajectory planning method for stratospheric airships based on the PPO algorithm. Its primary contributions are: (1) establishing a continuous action space model for stratospheric airship motion, which enables more precise control and adjustments across a broader range of actions; (2) integrating time-varying wind field data into the reinforcement learning environment, which enhances the policy network's adaptability and generalization to various environmental conditions; and (3) enabling the algorithm to automatically adjust and optimize flight paths in real time using wind speed information, reducing the need for human intervention. Experimental results show that, within its wind resistance capability, the airship can achieve long-duration regional station-keeping, with a maximum station-keeping time ratio (STR) of up to 0.997.
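The two quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The STR function assumes the metric is the fraction of uniformly sampled time steps spent inside a circular region; the abstract does not give the definition, so the circular region, the `radius_km` parameter, and both function names are hypothetical. The second function is the standard per-sample PPO clipped surrogate term, with `clip_eps=0.2` as an assumed default rather than a value taken from the paper.

```python
import math


def station_keeping_time_ratio(positions, center, radius_km):
    """Fraction of sampled time steps inside a circular region.

    Assumption: positions are (x, y) pairs in km, sampled at a uniform
    time step, so a count ratio equals a time ratio.
    """
    if not positions:
        raise ValueError("positions must not be empty")
    cx, cy = center
    inside = sum(
        1 for x, y in positions if math.hypot(x - cx, y - cy) <= radius_km
    )
    return inside / len(positions)


def ppo_clipped_term(ratio, advantage, clip_eps=0.2):
    """Per-sample PPO clipped surrogate: min(r * A, clip(r, 1-eps, 1+eps) * A)."""
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped_ratio * advantage)


# Hypothetical track: three of the four samples lie within 50 km of the origin.
track = [(0.0, 0.0), (10.0, 0.0), (60.0, 0.0), (0.0, 49.0)]
str_value = station_keeping_time_ratio(track, center=(0.0, 0.0), radius_km=50.0)
```

Under this assumed definition, an STR of 0.997 would mean the airship is outside the region for about 0.3% of the sampled flight time.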
Pages: 18