Q-Learning Aided Intelligent Routing With Maximum Utility in Cognitive UAV Swarm for Emergency Communications

Cited by: 22
Authors
Zhang, Long [1 ,2 ]
Ma, Xiaozheng [3 ]
Zhuang, Zirui [4 ]
Xu, Haitao [5 ]
Sharma, Vishal [6 ]
Han, Zhu [7 ,8 ]
Affiliations
[1] Hebei Univ Engn, Sch Informat & Elect Engn, Handan 056038, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Mobile Commun Technol, Chongqing 400065, Peoples R China
[3] Hebei Univ Engn, Sch Informat & Elect Engn, Handan 056038, Peoples R China
[4] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China
[5] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing 100083, Peoples R China
[6] Queens Univ Belfast, Sch Elect Elect Engn & Comp Sci, Belfast BT9 5BN, North Ireland
[7] Univ Houston, Dept Elect & Comp Engn, Houston, TX 77004 USA
[8] Kyung Hee Univ, Dept Comp Sci & Engn, Seoul 446701, South Korea
Funding
National Natural Science Foundation of China
Keywords
Emergency communications; UAV swarm; cognitive radio; intelligent routing; maximum utility; Q-learning; SPECTRUM ACCESS; NETWORKS; OPPORTUNITIES; INTEGRATION; CHALLENGES; DELAY;
DOI
10.1109/TVT.2022.3221538
Chinese Library Classification (CLC): TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Subject classification codes: 0808; 0809
Abstract
This article studies the routing problem in a cognitive unmanned aerial vehicle (UAV) swarm (CU-SWARM), which incorporates cognitive radio into a swarm of UAVs within a three-layer hierarchical aerial-ground integrated network architecture for emergency communications. In particular, the flexibly converged architecture utilizes a UAV swarm and a high-altitude platform to support aerial sensing and access, respectively, over the disaster-affected areas. We develop a Q-learning framework to achieve intelligent routing that maximizes the utility of the CU-SWARM. To characterize the reward function, we take into account both the routing metric design and the candidate UAV selection optimization. The routing metric jointly captures the achievable rate and the residual energy of each UAV. In addition, under the location, arc, and direction constraints, a circular sector is modeled by properly choosing the central angle and the acceptable signal-to-noise ratio for each UAV so as to optimize the candidate UAV selection. With this setup, we further propose a low-complexity iterative algorithm that uses a dynamic learning rate to update the Q-values during training, thereby achieving fast convergence. Simulation results are provided to assess the potential of the Q-learning framework for intelligent routing and to verify the overall iterative algorithm with the dynamic learning rate during the training procedure. Our findings reveal that the proposed algorithm converges within a small number of iterations. Furthermore, the proposed algorithm increases the accumulated rewards and achieves significant performance gains compared to the benchmark schemes.
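The abstract describes a tabular Q-learning update driven by a reward that mixes achievable rate and residual energy, a circular-sector filter for candidate next hops, and a learning rate that changes during training. The sketch below is only a minimal Python illustration of that general idea; the weighting factor w, the rate normalization, the 1/(1+n) learning-rate decay, the epsilon-greedy exploration, and all function and parameter names are assumptions made for illustration, not the formulations used in the paper.

```python
import random
from collections import defaultdict

def link_reward(rate_mbps, residual_energy, w=0.5, rate_max=50.0):
    """Combine normalized achievable rate and next-hop residual energy into one reward.

    The weight w and the normalization constant rate_max are illustrative assumptions.
    """
    return w * (rate_mbps / rate_max) + (1.0 - w) * residual_energy

def in_candidate_sector(angle_to_neighbor_deg, heading_deg, half_angle_deg, snr_db, snr_min_db):
    """Keep a neighbor only if it lies inside the forwarding sector and its SNR is acceptable."""
    offset = (angle_to_neighbor_deg - heading_deg + 180.0) % 360.0 - 180.0
    return abs(offset) <= half_angle_deg and snr_db >= snr_min_db

class QRouter:
    """Tabular Q-learning over (current UAV, next-hop UAV) pairs with a dynamic learning rate."""

    def __init__(self, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)     # Q-value per (node, next_hop) pair
        self.visits = defaultdict(int)  # visit counts used to decay the learning rate
        self.gamma = gamma
        self.epsilon = epsilon

    def select_next_hop(self, node, candidates):
        """Epsilon-greedy choice among sector-filtered candidate next hops (assumed policy)."""
        if random.random() < self.epsilon:
            return random.choice(candidates)
        return max(candidates, key=lambda c: self.q[(node, c)])

    def update(self, node, next_hop, reward, next_candidates):
        """One Q-value update; the learning rate shrinks as a state-action pair is revisited."""
        self.visits[(node, next_hop)] += 1
        alpha = 1.0 / (1.0 + self.visits[(node, next_hop)])  # assumed decay schedule
        best_next = max((self.q[(next_hop, c)] for c in next_candidates), default=0.0)
        td_target = reward + self.gamma * best_next
        self.q[(node, next_hop)] += alpha * (td_target - self.q[(node, next_hop)])
```

In this sketch, the per-pair visit count makes early updates aggressive and later updates conservative, which is one common way to realize a dynamic learning rate aimed at faster and more stable convergence.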
Pages: 3707-3723
Page count: 17