Trajectory Planning With Deep Reinforcement Learning in High-Level Action Spaces

Cited by: 7
Authors
Williams, Kyle R. [1]
Schlossman, Rachel [1]
Whitten, Daniel [1]
Ingram, Joe
Musuvathy, Srideep [1]
Pagan, James [1]
Williams, Kyle A. [1]
Green, Sam [2]
Patel, Anirudh [2]
Mazumdar, Anirban [3]
Parish, Julie [1]
Affiliations
[1] Sandia Natl Labs, Albuquerque, CA 94551 USA
[2] Semiot Labs, Los Altos, CA 94022 USA
[3] Georgia Inst Technol, Atlanta, GA 30332 USA
Keywords
Trajectory; Planning; Trajectory planning; Training; Reinforcement learning; Optimization; Aerodynamics
DOI
10.1109/TAES.2022.3218496
Chinese Library Classification (CLC) number
V [Aeronautics, Astronautics]
Subject classification code
08; 0825
Abstract
This article presents a technique for trajectory planning based on parameterized high-level actions. These high-level actions are subtrajectories that have variable shape and duration. The use of high-level actions can improve the performance of guidance algorithms. Specifically, we show how the use of high-level actions improves the performance of guidance policies that are generated via reinforcement learning (RL). RL has shown great promise for solving complex control, guidance, and coordination problems but can still suffer from long training times and poor performance. This work shows how the use of high-level actions reduces the required number of training steps and increases the path performance of an RL-trained guidance policy. We demonstrate the method on a space-shuttle guidance example. We show the proposed method increases the path performance (latitude range) by 18% compared with a baseline RL implementation. Similarly, we show the proposed method achieves steady state during training with approximately 75% fewer training steps. We also show how the guidance policy enables effective performance in an obstacle field. Finally, this article develops a loss function term for policy-gradient-based deep RL, which is analogous to an antiwindup mechanism in feedback control. We demonstrate that the inclusion of this term in the underlying optimization increases the average policy return in our numerical example.
Pages: 2513-2529
Page count: 17
Related papers
50 results in total
  • [21] Fast and slow curiosity for high-level exploration in reinforcement learning
    Bougie, Nicolas
    Ichise, Ryutaro
    APPLIED INTELLIGENCE, 2021, 51 (02) : 1086 - 1107
  • [22] Action Spaces in Deep Reinforcement Learning to Mimic Human Input Devices
    Pleines, Marco
    Zimmer, Frank
    Berges, Vincent-Pierre
    2019 IEEE CONFERENCE ON GAMES (COG), 2019,
  • [23] Deep reinforcement learning-based reactive trajectory planning method for UAVs
    Cao, Lijia
    Wang, Lin
    Liu, Yang
    Xu, Weihong
    Geng, Chuang
    PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART G-JOURNAL OF AEROSPACE ENGINEERING, 2024, 238 (10) : 1018 - 1037
  • [24] Stratospheric airship trajectory planning in wind field using deep reinforcement learning
    Qi, Lele
    Yang, Xixiang
    Bai, Fangchao
    Deng, Xiaolong
    Pan, Yuelong
    ADVANCES IN SPACE RESEARCH, 2025, 75 (01) : 620 - 634
  • [25] AoI optimal UAV trajectory planning: A Deep Recurrent Reinforcement Learning Approach
    Wu, Mengjie
    Chi, Huijia
    Gan, Shuying
    Wang, Xijun
    Xu, Chao
    2021 IEEE 32ND ANNUAL INTERNATIONAL SYMPOSIUM ON PERSONAL, INDOOR AND MOBILE RADIO COMMUNICATIONS (PIMRC), 2021,
  • [26] Optimizing Robotic Task Sequencing and Trajectory Planning on the Basis of Deep Reinforcement Learning
    Dong, Xiaoting
    Wan, Guangxi
    Zeng, Peng
    Song, Chunhe
    Cui, Shijie
    BIOMIMETICS, 2024, 9 (01)
  • [27] Short-Term Trajectory Planning in TORCS using Deep Reinforcement Learning
    Capo, Emilio
    Loiacono, Daniele
    2020 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2020, : 2327 - 2334
  • [28] Deep reinforcement learning trajectory planning for vibration suppression via jerk control
    Park, Sung Gwan
    Rhim, Sungsoo
    2023 20TH INTERNATIONAL CONFERENCE ON UBIQUITOUS ROBOTS, UR, 2023, : 818 - 824
  • [29] Deep Reinforcement Learning for Real-Time Trajectory Planning in UAV Networks
    Li, Kai
    Ni, Wei
    Tovar, Eduardo
    Guizani, Mohsen
    2020 16TH INTERNATIONAL WIRELESS COMMUNICATIONS & MOBILE COMPUTING CONFERENCE, IWCMC, 2020, : 958 - 963
  • [30] Improving Safety in Deep Reinforcement Learning using Unsupervised Action Planning
    Hsu, Hao-Lun
    Huang, Qiuhua
    Ha, Sehoon
    2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2022), 2022, : 5567 - 5573