Q-Learning Aided Intelligent Routing With Maximum Utility in Cognitive UAV Swarm for Emergency Communications

Cited by: 22
Authors
Zhang, Long [1 ,2 ]
Ma, Xiaozheng [3 ]
Zhuang, Zirui [4 ]
Xu, Haitao [5 ]
Sharma, Vishal [6 ]
Han, Zhu [7 ,8 ]
Affiliations
[1] Hebei Univ Engn, Sch Informat & Elect Engn, Handan 056038, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Mobile Commun Technol, Chongqing 400065, Peoples R China
[3] Hebei Univ Engn, Sch Informat & Elect Engn, Handan 056038, Peoples R China
[4] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China
[5] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing 100083, Peoples R China
[6] Queens Univ Belfast, Sch Elect Elect Engn & Comp Sci, Belfast BT9 5BN, North Ireland
[7] Univ Houston, Dept Elect & Comp Engn, Houston, TX 77004 USA
[8] Kyung Hee Univ, Dept Comp Sci & Engn, Seoul 446701, South Korea
Funding
National Natural Science Foundation of China
Keywords
Emergency communications; UAV swarm; cognitive radio; intelligent routing; maximum utility; Q-learning; SPECTRUM ACCESS; NETWORKS; OPPORTUNITIES; INTEGRATION; CHALLENGES; DELAY;
DOI
10.1109/TVT.2022.3221538
Chinese Library Classification (CLC): TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Subject classification codes: 0808; 0809
Abstract
This article studies the routing problem in a cognitive unmanned aerial vehicle (UAV) swarm (CU-SWARM), which incorporates cognitive radio into a swarm of UAVs within a three-layer hierarchical aerial-ground integrated network architecture for emergency communications. In particular, the flexibly converged architecture utilizes a UAV swarm and a high-altitude platform to support aerial sensing and access, respectively, over the disaster-affected areas. We develop a Q-learning framework to achieve intelligent routing that maximizes the utility of the CU-SWARM. To characterize the reward function, we take into account both the routing metric design and the candidate UAV selection optimization. The routing metric jointly captures the achievable rate and the residual energy of each UAV. In addition, under the location, arc, and direction constraints, a circular sector is modeled by properly choosing the central angle and the acceptable signal-to-noise ratio for each UAV so as to optimize the candidate UAV selection. With this setup, we further propose a low-complexity iterative algorithm that uses a dynamic learning rate to update the Q-values during training, thereby achieving fast convergence. Simulation results are provided to assess the potential of the Q-learning framework for intelligent routing and to verify the overall iterative algorithm with the dynamic learning rate during the training procedure. Our findings reveal that the proposed algorithm converges within a small number of iterations. Furthermore, the proposed algorithm increases the accumulated rewards and achieves significant performance gains compared to the benchmark schemes.
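The abstract describes a tabular Q-learning update driven by a reward that mixes achievable rate and residual energy, a circular-sector filter for candidate next hops, and a learning rate that changes during training. The sketch below is only a minimal Python illustration of that general idea; the weighting factor w, the rate normalization, the 1/(1+n) learning-rate decay, the epsilon-greedy exploration, and all function and parameter names are assumptions made for illustration, not the formulations used in the paper.

```python
import random
from collections import defaultdict

def link_reward(rate_mbps, residual_energy, w=0.5, rate_max=50.0):
    """Combine normalized achievable rate and next-hop residual energy into one reward.

    The weight w and the normalization constant rate_max are illustrative assumptions.
    """
    return w * (rate_mbps / rate_max) + (1.0 - w) * residual_energy

def in_candidate_sector(angle_to_neighbor_deg, heading_deg, half_angle_deg, snr_db, snr_min_db):
    """Keep a neighbor only if it lies inside the forwarding sector and its SNR is acceptable."""
    offset = (angle_to_neighbor_deg - heading_deg + 180.0) % 360.0 - 180.0
    return abs(offset) <= half_angle_deg and snr_db >= snr_min_db

class QRouter:
    """Tabular Q-learning over (current UAV, next-hop UAV) pairs with a dynamic learning rate."""

    def __init__(self, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)     # Q-value per (node, next_hop) pair
        self.visits = defaultdict(int)  # visit counts used to decay the learning rate
        self.gamma = gamma
        self.epsilon = epsilon

    def select_next_hop(self, node, candidates):
        """Epsilon-greedy choice among sector-filtered candidate next hops (assumed policy)."""
        if random.random() < self.epsilon:
            return random.choice(candidates)
        return max(candidates, key=lambda c: self.q[(node, c)])

    def update(self, node, next_hop, reward, next_candidates):
        """One Q-value update; the learning rate shrinks as a state-action pair is revisited."""
        self.visits[(node, next_hop)] += 1
        alpha = 1.0 / (1.0 + self.visits[(node, next_hop)])  # assumed decay schedule
        best_next = max((self.q[(next_hop, c)] for c in next_candidates), default=0.0)
        td_target = reward + self.gamma * best_next
        self.q[(node, next_hop)] += alpha * (td_target - self.q[(node, next_hop)])
```

In this sketch, the per-pair visit count makes early updates aggressive and later updates conservative, which is one common way to realize a dynamic learning rate aimed at faster and more stable convergence.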
Pages: 3707-3723
Page count: 17