Trajectory Planning With Deep Reinforcement Learning in High-Level Action Spaces

Cited by: 7
Authors
Williams, Kyle R. [1 ]
Schlossman, Rachel [1 ]
Whitten, Daniel [1 ]
Ingram, Joe
Musuvathy, Srideep [1 ]
Pagan, James [1 ]
Williams, Kyle A. [1 ]
Green, Sam [2 ]
Patel, Anirudh [2 ]
Mazumdar, Anirban [3 ]
Parish, Julie [1 ]
Affiliations
[1] Sandia Natl Labs, Albuquerque, CA 94551 USA
[2] Semiot Labs, Los Altos, CA 94022 USA
[3] Georgia Inst Technol, Atlanta, GA 30332 USA
Keywords
Trajectory; Planning; Trajectory planning; Training; Reinforcement learning; Optimization; Aerodynamics
DOI
10.1109/TAES.2022.3218496
Chinese Library Classification number
V [Aeronautics, Astronautics]
Discipline classification code
08; 0825
Abstract
This article presents a technique for trajectory planning based on parameterized high-level actions. These high-level actions are subtrajectories that have variable shape and duration. The use of high-level actions can improve the performance of guidance algorithms. Specifically, we show how the use of high-level actions improves the performance of guidance policies that are generated via reinforcement learning (RL). RL has shown great promise for solving complex control, guidance, and coordination problems but can still suffer from long training times and poor performance. This work shows how the use of high-level actions reduces the required number of training steps and increases the path performance of an RL-trained guidance policy. We demonstrate the method on a space-shuttle guidance example. We show the proposed method increases the path performance (latitude range) by 18% compared with a baseline RL implementation. Similarly, we show the proposed method achieves steady state during training with approximately 75% fewer training steps. We also show how the guidance policy enables effective performance in an obstacle field. Finally, this article develops a loss function term for policy-gradient-based deep RL, which is analogous to an antiwindup mechanism in feedback control. We demonstrate that the inclusion of this term in the underlying optimization increases the average policy return in our numerical example.
Pages: 2513-2529
Page count: 17