A Deep Reinforcement Learning-Based Intelligent Maneuvering Strategy for the High-Speed UAV Pursuit-Evasion Game

Cited by: 1

Authors:

Yan, Tian [1,2,3]

Liu, Can [1]

Gao, Mengjing [1]

Jiang, Zijian [1]

Li, Tong [1]

Affiliations:

[1] Northwestern Polytech Univ, Unmanned Syst Res Inst, Xian 710072, Peoples R China

[2] Northwestern Polytech Univ, Natl Key Lab Unmanned Aerial Vehicle Technol, Xian 710072, Peoples R China

[3] Northwestern Polytech Univ, Integrated Res & Dev Platform Unmanned Aerial Vehi, Xian 710072, Peoples R China

Source:

DRONES | 2024 / Vol. 8 / Issue 07

Funding:

National Natural Science Foundation of China;

Keywords:

pursuit-evasion game; line-of-sight angle rate; high-speed UAV; deep reinforcement learning; proportional navigation;

DOI:

10.3390/drones8070309

Chinese Library Classification (CLC) number:

TP7 [Remote sensing technology];

Discipline classification codes:

081102 ; 0816 ; 081602 ; 083002 ; 1404 ;

Abstract:

Given the rapid advancements in kinetic pursuit technology, this paper introduces a maneuvering strategy, denoted LSRC-TD3, that integrates line-of-sight (LOS) angle rate correction with deep reinforcement learning (DRL) for high-speed unmanned aerial vehicle (UAV) pursuit-evasion (PE) game scenarios, with the aim of evading high-speed, highly dynamic pursuers. In challenging game situations where the UAV is at a disadvantage in both speed and maximum available overload, its playing field is severely compressed and evasion becomes significantly more difficult, placing higher demands on the strategy and timing of maneuvering to change orbit. Taking evasion, trajectory constraints, and energy consumption into account, we formulate the reward function by combining "terminal" and "process" rewards with "strong" and "weak" incentive guidance, in order to reduce the difficulty of early exploration and accelerate convergence of the game network. Additionally, this paper introduces a correction factor for LOS angle rate into the twin delayed deep deterministic policy gradient (TD3) algorithm, thereby enhancing the sensitivity of high-speed UAVs to changes in LOS rate and the accuracy of evasion timing, which improves the effectiveness and adaptive capability of the intelligent maneuvering strategy. Monte Carlo simulation results demonstrate that the proposed method achieves a high level of evasion performance, combining energy optimization with the requisite miss distance for high-speed UAVs, and accomplishes efficient evasion in highly challenging PE game scenarios.
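The abstract does not give the form of the LOS angle rate correction factor, so the following is only a minimal sketch of the underlying quantity: the standard planar LOS angle rate computed from relative position and relative velocity. The function name `los_angle_rate` and the 2D tuple inputs are assumptions made for illustration and are not taken from the paper.

```python
import math


def los_angle_rate(rel_pos, rel_vel):
    """Planar LOS angle rate in rad/s (illustrative helper, not from the paper).

    rel_pos: (x, y) position of the pursuer relative to the evader, in metres.
    rel_vel: (vx, vy) velocity of the pursuer relative to the evader, in m/s.
    Uses d/dt atan2(y, x) = (x * vy - y * vx) / (x^2 + y^2).
    """
    rx, ry = rel_pos
    vx, vy = rel_vel
    range_sq = rx * rx + ry * ry
    if range_sq == 0.0:
        raise ValueError("relative range is zero; LOS angle is undefined")
    return (rx * vy - ry * vx) / range_sq


# Pure crossing motion at 1 km range: 10 m/s across the LOS gives 0.01 rad/s.
crossing_rate = los_angle_rate((1000.0, 0.0), (0.0, 10.0))

# Pure closing motion along the LOS gives zero LOS rate (collision course).
closing_rate = los_angle_rate((1000.0, 0.0), (-300.0, 0.0))
```

A LOS rate near zero while the range is closing indicates a collision course, which is why sensitivity to changes in this quantity matters for the timing of an evasive maneuver.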


Pages: 20
