Research on Autonomous Manoeuvre Decision Making in Within-Visual-Range Aerial Two-Player Zero-Sum Games Based on Deep Reinforcement Learning

Cited by: 0

Authors:

Lu, Bo [1,2]

Ru, Le [1,2]

Hu, Shiguang [1,2]

Wang, Wenfei [1,2]

Xi, Hailong [1,2]

Zhao, Xiaolin [1,2]

Affiliations:

[1] Air Force Engineering University, Equipment Management & UAV Engineering College, Xi'an 710051, People's Republic of China

[2] Air Force Engineering University, National Key Laboratory of Unmanned Aerial Vehicle Technology, Xi'an 710051, People's Republic of China

Source:

MATHEMATICS | 2024 / Vol. 12 / Issue 14

Keywords:

WVR; TZSG; deep reinforcement learning; Markov decision processes; decision making; AIR COMBAT; AIRCRAFT; SYSTEM;

DOI:

10.3390/math12142160

Chinese Library Classification number:

O1 [Mathematics];

Discipline classification code:

0701 ; 070101 ;

Abstract:

In recent years, as technology has advanced rapidly towards automation and intelligence, autonomous decision-making capabilities in unmanned systems are poised to play a crucial role in contemporary aerial two-player zero-sum games (TZSGs). Deep reinforcement learning (DRL) methods enable agents to make autonomous manoeuvring decisions. This paper focuses on current mainstream DRL algorithms based on fundamental tactical manoeuvres and selects a typical aerial TZSG scenario: within-visual-range (WVR) combat. We model the key elements influencing the game using a Markov decision process (MDP) and demonstrate the mathematical foundation for implementing DRL. Using high-fidelity simulation software (Warsim v1.0), we design a prototypical close-range aerial combat scenario. In this environment, we train mainstream DRL algorithms and analyse the training outcomes. We summarise the effectiveness of these algorithms in enabling agents to manoeuvre autonomously in aerial TZSGs, providing a basis for further research.

Pages: 16
