Model-Free Guidance Method for Drones in Complex Environments Using Direct Policy Exploration and Optimization

Cited by: 3
Authors
Liu, Hongxun [1]
Suzuki, Satoshi [1]
Affiliations
[1] Chiba Univ, Sch Sci & Engn, 1-33 Yayoi Cho,Inage Ku, Chiba 2638522, Japan
Keywords
drones; reinforcement learning; policy optimization; model-free; traverse complex environments;
DOI
10.3390/drones7080514
Chinese Library Classification number
TP7 [Remote sensing technology]
Discipline classification codes
081102; 0816; 081602; 083002; 1404
Abstract
Over the past few decades, drones have become lighter and more agile and can stay airborne longer. To exploit these capabilities in complex environments, researchers have proposed various model-based perception, planning, and control methods that decompose the problem into modules and accomplish the task sequentially. In practical environments, however, it is extremely difficult to model both the drones and their surroundings, and very few existing model-based methods achieve this. In this study, we propose a novel model-free, reinforcement-learning-based method that learns the optimal planning and control policy from flight experience. During training, the policy takes the complete state of the drone and environmental information as inputs and self-optimizes against a predefined reward function. In practical deployment, the policy takes inputs from onboard and external sensors and outputs control commands to low-level velocity controllers in an end-to-end manner. Because of this property, the planning and control policy can be improved without an accurate system model and can drive drones through complex environments at high speed. The policy was trained and tested in a simulator and in real-world flight experiments, demonstrating its practical applicability. The results show that this model-free method can learn to fly effectively and has great potential for handling different tasks and environments.
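The idea of a policy that maps observations to velocity commands and improves itself against a reward function, without any system model, can be illustrated with a toy example. The sketch below is not the authors' algorithm: it swaps in plain random-perturbation hill climbing as the simplest form of direct policy search, and the 2D point-mass world, the obstacle, the reward terms, and all names (`policy`, `rollout`, `hill_climb`, `GOAL`, `OBSTACLE`) are assumptions made for illustration only.

```python
import math
import random

# Hypothetical toy world: a point "drone" flies from the origin to GOAL
# past one circular obstacle. None of these values come from the paper.
GOAL = (5.0, 0.0)
OBSTACLE = (2.5, 0.2)
OBSTACLE_RADIUS = 0.5
MAX_SPEED = 1.0   # limit on the commanded velocity
DT = 0.1          # simulation time step
STEPS = 100       # episode length


def policy(params, pos):
    """Linear policy: observation -> clipped velocity command (vx, vy)."""
    obs = (GOAL[0] - pos[0], GOAL[1] - pos[1],
           pos[0] - OBSTACLE[0], pos[1] - OBSTACLE[1], 1.0)
    vx = sum(w * o for w, o in zip(params[:5], obs))
    vy = sum(w * o for w, o in zip(params[5:], obs))
    speed = math.hypot(vx, vy)
    if speed > MAX_SPEED:
        vx, vy = vx * MAX_SPEED / speed, vy * MAX_SPEED / speed
    return vx, vy


def rollout(params):
    """Run one episode and return its total reward.

    The learner only sees this number; it never uses a model of the dynamics.
    """
    pos = [0.0, 0.0]
    total = 0.0
    for _ in range(STEPS):
        vx, vy = policy(params, pos)
        pos[0] += vx * DT
        pos[1] += vy * DT
        # Reward: penalize remaining distance to the goal at every step.
        total -= math.hypot(GOAL[0] - pos[0], GOAL[1] - pos[1]) * DT
        if math.hypot(pos[0] - OBSTACLE[0], pos[1] - OBSTACLE[1]) < OBSTACLE_RADIUS:
            total -= 100.0  # collision penalty, then the episode ends
            break
    return total


def hill_climb(iterations=300, sigma=0.1, seed=0):
    """Direct policy search: perturb parameters, keep only improvements."""
    rng = random.Random(seed)
    params = [0.0] * 10          # the all-zero policy commands zero velocity
    initial = best = rollout(params)
    for _ in range(iterations):
        candidate = [p + rng.gauss(0.0, sigma) for p in params]
        reward = rollout(candidate)
        if reward > best:
            params, best = candidate, reward
    return params, initial, best


if __name__ == "__main__":
    _, initial, best = hill_climb()
    print(f"reward before: {initial:.2f}, after: {best:.2f}")
```

Because a candidate is accepted only when its episode reward is higher, the best reward never decreases. A practical method would typically use a gradient-based policy optimizer and a far richer observation and reward design than this sketch.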
Pages: 19