Autopilot parameter rapid tuning method based on deep reinforcement learning

Cited by: 0
Authors
Wan Q. [1 ]
Lu B. [2 ]
Zhao Y. [3 ]
Wen Q. [1 ]
Affiliations
[1] School of Aerospace Engineering, Beijing Institute of Technology, Beijing
[2] Beijing Institute of Space Long March Vehicle, Beijing
[3] China Academy of Launch Vehicle Technology, Beijing
Source
Xi Tong Gong Cheng Yu Dian Zi Ji Shu/Systems Engineering and Electronics | 2022, Vol. 44, No. 10
Keywords
autopilot; intelligent control; normalization; parameter tuning; reinforcement learning;
DOI
10.12305/j.issn.1001-506X.2022.10.23
Abstract
To address the slow training and poor convergence of deep reinforcement learning when used to train autopilot control parameters, an intelligent tuning method is proposed that converts the three-dimensional control parameters into a single one-dimensional design parameter, with the three-loop autopilot pole placement method as its core. An intelligent control architecture is constructed that combines offline deep reinforcement learning training with online real-time computation by a multi-layer perceptron neural network; it improves the efficiency and convergence of the deep reinforcement learning algorithm and enables rapid online tuning of the control parameters under large-scale changes in flight state. Taking a typical reentry vehicle as an example, deep reinforcement learning training and neural network deployment are carried out. The simulation results show that the simplified reinforcement learning action space trains more efficiently, and that with the proposed rapid parameter tuning method based on deep reinforcement learning, the controller's tracking error with respect to the control command is less than 1.2%. © 2022 Chinese Institute of Electronics. All rights reserved.
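The core idea of the abstract, that one scalar design parameter fixes all three loop gains through pole placement, can be sketched as follows. The closed-loop polynomial structure, the airframe coefficients, and the gain names below are illustrative assumptions for a generic third-order loop, not the paper's actual autopilot model or formulas:

```python
import numpy as np

def gains_from_design_param(omega_c: float, zeta: float = 0.7,
                            airframe=(2.0, 5.0, 1.0)):
    """Illustrative pole-placement mapping (hypothetical, not the paper's).

    A single design parameter omega_c (desired bandwidth) fixes three
    desired closed-loop poles. Matching the assumed characteristic
    polynomial
        s^3 + (a1 + kr) s^2 + (a2 + ka) s + (a3 + kdc)
    against the desired polynomial
        (s + omega_c)(s^2 + 2*zeta*omega_c*s + omega_c^2)
    gives all three loop gains in closed form.
    """
    a1, a2, a3 = airframe  # hypothetical open-loop coefficients; in
    # practice these would come from the flight state (Mach, altitude,
    # dynamic pressure).
    c2 = (1 + 2 * zeta) * omega_c       # desired s^2 coefficient
    c1 = (1 + 2 * zeta) * omega_c ** 2  # desired s^1 coefficient
    c0 = omega_c ** 3                   # desired s^0 coefficient
    kr, ka, kdc = c2 - a1, c1 - a2, c0 - a3
    return kr, ka, kdc

# One scalar action from the RL agent yields all three gains at once,
# collapsing the 3-D search space to 1-D.
kr, ka, kdc = gains_from_design_param(2.0)
```

In the architecture the abstract describes, the offline-trained agent's scalar action would feed a mapping of this kind, and a multi-layer perceptron regressing flight state to the design parameter would replace the agent for online real-time computation.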
Pages: 3190-3199
Page count: 9
References
32 references in total
  • [1] GARNELL P., Guided weapon control systems [M], (1980)
  • [2] WEN Q Q, XIA Q L, QI Z K., Pole placement design with open-loop crossover frequency constraint for three-loop autopilot [J], Systems Engineering and Electronics, 31, 2, pp. 420-423, (2009)
  • [3] SUN B C, QI Z K., Study of pole placement method for state feedback constrained autopilot design, Journal of System Simulation, 18, pp. 892-896, (2006)
  • [4] ZHU J J, QI Z K, XIA Q L., Pole assignment method for three-loop autopilot design, Journal of Projectiles, Rockets, Missiles and Guidance, 27, 4, pp. 8-12, (2007)
  • [5] WANG H, LIN D F, QI Z K., Design and analysis of missile three-loop autopilot with pseudo-angle of attack feedback, Systems Engineering and Electronics, 34, 1, pp. 129-135, (2012)
  • [6] ZENG X, ZHU Y W, YANG L, Et al., A guidance method for coplanar orbital interception based on reinforcement learning, Journal of Systems Engineering and Electronics, 32, 4, pp. 927-938, (2021)
  • [7] LI Y, QIU X H, LIU X D, Et al., Deep reinforcement learning and its application in autonomous fitting optimization for attack areas of UCAVs, Journal of Systems Engineering and Electronics, 31, 4, pp. 734-742, (2020)
  • [8] MA Y, CHANG T Q, FAN W H., A single-task and multi-decision evolutionary game model based on multi-agent reinforcement learning, Journal of Systems Engineering and Electronics, 32, 3, pp. 642-657, (2021)
  • [9] MIN F, FRANS C G., Collaborative multi-agent reinforcement learning based on experience propagation, Journal of Systems Engineering and Electronics, 24, 4, pp. 683-689, (2013)
  • [10] SUTTON R S, BARTO A G., Reinforcement learning: an introduction, (2014)