Research on Autonomous Manoeuvre Decision Making in Within-Visual-Range Aerial Two-Player Zero-Sum Games Based on Deep Reinforcement Learning

Cited by: 0
Authors
Lu, Bo [1 ,2 ]
Ru, Le [1 ,2 ]
Hu, Shiguang [1 ,2 ]
Wang, Wenfei [1 ,2 ]
Xi, Hailong [1 ,2 ]
Zhao, Xiaolin [1 ,2 ]
Affiliations
[1] Air Force Engn Univ, Equipment Management & UAV Engn Coll, Xian 710051, Peoples R China
[2] Air Force Engn Univ, Natl Key Lab Unmanned Aerial Vehicle Technol, Xian 710051, Peoples R China
Keywords
WVR; TZSG; deep reinforcement learning; Markov decision processes; decision making; AIR COMBAT; AIRCRAFT; SYSTEM;
DOI
10.3390/math12142160
Chinese Library Classification
O1 [Mathematics];
Subject classification code
0701 ; 070101 ;
Abstract
In recent years, with the accelerated development of technology towards automation and intelligence, autonomous decision-making capabilities in unmanned systems are poised to play a crucial role in contemporary aerial two-player zero-sum games (TZSGs). Deep reinforcement learning (DRL) methods enable agents to make autonomous manoeuvring decisions. This paper focuses on current mainstream DRL algorithms based on fundamental tactical manoeuvres, selecting a typical aerial TZSG scenario: within-visual-range (WVR) combat. We model the key elements influencing the game using a Markov decision process (MDP) and demonstrate the mathematical foundation for implementing DRL. Leveraging high-fidelity simulation software (Warsim v1.0), we design a prototypical close-range aerial combat scenario. Utilizing this environment, we train mainstream DRL algorithms and analyse the training outcomes. The effectiveness of these algorithms in enabling agents to manoeuvre autonomously in aerial TZSGs is summarised, providing a foundational basis for further research.
Pages: 16
Related papers
26 in total
  • [1] Stochastic Two-Player Zero-Sum Learning Differential Games
    Liu, Mushuang
    Wan, Yan
    Lewis, Frank L.
    Lopez, Victor G.
    2019 IEEE 15TH INTERNATIONAL CONFERENCE ON CONTROL AND AUTOMATION (ICCA), 2019, : 1038 - 1043
  • [2] The Lagging Anchor Algorithm: Reinforcement Learning in Two-Player Zero-Sum Games with Imperfect Information
    Fredrik A. Dahl
    Machine Learning, 2002, 49 : 5 - 37
  • [3] The lagging anchor algorithm: Reinforcement learning in two-player zero-sum games with imperfect information
    Dahl, FA
    MACHINE LEARNING, 2002, 49 (01) : 5 - 37
  • [4] Reinforcement Learning Based Solution to Two-player Zero-sum Game Using Differentiator
    Guo, Xinxin
    Yan, Weisheng
    Cui, Peng
    Zhang, Shouxu
    2018 3RD IEEE INTERNATIONAL CONFERENCE ON ADVANCED ROBOTICS AND MECHATRONICS (IEEE ICARM), 2018, : 708 - 713
  • [5] Large Scale Learning of Agent Rationality in Two-Player Zero-Sum Games
    Ling, Chun Kai
    Fang, Fei
    Kolter, J. Zico
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 6104 - 6111
  • [6] Solutions for zero-sum two-player games with noncompact decision sets and unbounded payoffs
    Feinberg, Eugene A.
    Kasyanov, Pavlo O.
    Zgurovsky, Michael Z.
    NAVAL RESEARCH LOGISTICS, 2023, 70 (05) : 493 - 506
  • [7] Improved saddle point prediction in stochastic two-player zero-sum games with a deep learning approach
    Wu, Dawen
    Lisser, Abdel
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 126
  • [8] 2L2, a simple reinforcement learning scheme for two-player zero-sum Markov games
    Frenay, Benoit
    Saerens, Marco
    NEUROCOMPUTING, 2009, 72 (7-9) : 1494 - 1507
  • [9] Online Minimax Q Network Learning for Two-Player Zero-Sum Markov Games
    Zhu, Yuanheng
    Zhao, Dongbin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (03) : 1228 - 1241
  • [10] Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games with Bandit Feedback
    Cai, Yang
    Luo, Haipeng
    Wei, Chen-Yu
    Zheng, Weiqiang
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,