Reinforcement learning method for supercritical airfoil aerodynamic design

Cited by: 0
Authors
Li R. [1 ]
Zhang Y. [1 ]
Chen H. [1 ]
Affiliation
[1] School of Aerospace Engineering, Tsinghua University, Beijing
Funding
National Natural Science Foundation of China
Keywords
Application transferability; Imitation learning; Incremental modification; Pretraining; Proximal Policy Optimization (PPO); Reinforcement learning
DOI
10.7527/S1000-6893.2020.23810
Abstract
Reinforcement learning is a machine learning method for learning policies in a way similar to the human learning process: an agent interacts with its environment and learns how to obtain more reward. In this paper, the elements and algorithms of reinforcement learning are defined and adapted for the supercritical airfoil aerodynamic design process. The results of imitation learning are then studied, and the policies obtained from imitation learning are used to pretrain the reinforcement learning agent. The influence of different pretraining processes is examined, and the final policies are tested in other, similar environments. The results show that pretraining improves both reinforcement learning efficiency and policy robustness, and that the final policies also perform satisfactorily in similar environments. © 2021, Beihang University Aerospace Knowledge Press. All rights reserved.
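The setup the abstract describes — a state of airfoil shape parameters, actions that apply incremental modifications, and a reward reflecting aerodynamic improvement — can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: `AirfoilEnv` uses a toy quadratic surrogate in place of a CFD evaluation, and a crude random-search policy improvement stands in for the PPO algorithm the paper actually uses.

```python
import numpy as np

class AirfoilEnv:
    """Toy stand-in for the design environment described in the abstract:
    the state is a vector of shape parameters (e.g. CST coefficients),
    the action is an incremental modification of those parameters, and
    the reward is the improvement of a surrogate objective. The quadratic
    objective here replaces the CFD evaluation an actual setup would use."""

    def __init__(self, n_params=6, seed=0):
        rng = np.random.default_rng(seed)
        self.target = rng.normal(size=n_params)  # hypothetical optimal shape
        self.n_params = n_params

    def reset(self):
        self.x = np.zeros(self.n_params)
        return self.x.copy()

    def step(self, action):
        before = -np.sum((self.x - self.target) ** 2)
        # incremental modification, bounded per step
        self.x = self.x + np.clip(action, -0.1, 0.1)
        after = -np.sum((self.x - self.target) ** 2)
        return self.x.copy(), after - before  # reward = objective improvement

def evaluate(env, w, horizon=20):
    """Return of one design episode under a linear policy a = W s + b."""
    s = env.reset()
    total = 0.0
    for _ in range(horizon):
        a = w[:, :-1] @ s + w[:, -1]
        s, r = env.step(a)
        total += r
    return total

def train(env, iters=200, seed=1):
    """Crude random-search policy improvement; the paper uses PPO,
    which is swapped out here to keep the sketch dependency-free."""
    rng = np.random.default_rng(seed)
    n = env.n_params
    best_w = np.zeros((n, n + 1))    # policy weights plus a bias column
    best_ret = evaluate(env, best_w)  # the zero policy scores 0.0
    for _ in range(iters):
        cand = best_w + 0.1 * rng.normal(size=best_w.shape)
        ret = evaluate(env, cand)
        if ret > best_ret:            # keep only improving candidates
            best_w, best_ret = cand, ret
    return best_w, best_ret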