Q-Learning Aided Intelligent Routing With Maximum Utility in Cognitive UAV Swarm for Emergency Communications

Cited by: 22
Authors
Zhang, Long [1 ,2 ]
Ma, Xiaozheng [3 ]
Zhuang, Zirui [4 ]
Xu, Haitao [5 ]
Sharma, Vishal [6 ]
Han, Zhu [7 ,8 ]
Affiliations
[1] Hebei Univ Engn, Sch Informat & Elect Engn, Handan 056038, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Mobile Commun Technol, Chongqing 400065, Peoples R China
[3] Hebei Univ Engn, Sch Informat & Elect Engn, Handan 056038, Peoples R China
[4] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China
[5] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing 100083, Peoples R China
[6] Queens Univ Belfast, Sch Elect Elect Engn & Comp Sci, Belfast BT9 5BN, North Ireland
[7] Univ Houston, Dept Elect & Comp Engn, Houston, TX 77004 USA
[8] Kyung Hee Univ, Dept Comp Sci & Engn, Seoul 446701, South Korea
Funding
National Natural Science Foundation of China;
Keywords
Emergency communications; UAV swarm; cognitive radio; intelligent routing; maximum utility; Q-learning; SPECTRUM ACCESS; NETWORKS; OPPORTUNITIES; INTEGRATION; CHALLENGES; DELAY;
DOI
10.1109/TVT.2022.3221538
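Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract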
This article studies the routing problem in a cognitive unmanned aerial vehicle (UAV) swarm (CU-SWARM), which integrates cognitive radio into a swarm of UAVs within a three-layer hierarchical aerial-ground integrated network architecture for emergency communications. In particular, the flexibly converged architecture utilizes a UAV swarm and a high-altitude platform to support aerial sensing and access, respectively, over the disaster-affected areas. We develop a Q-learning framework to achieve intelligent routing that maximizes the utility of the CU-SWARM. To characterize the reward function, we take into account both the routing metric design and the candidate UAV selection optimization. The routing metric jointly captures the achievable rate and the residual energy of each UAV. In addition, under the location, arc, and direction constraints, the circular sector is modeled by properly choosing the central angle and the acceptable signal-to-noise ratio for each UAV to optimize the candidate UAV selection. With this setup, we further propose a low-complexity iterative algorithm that uses a dynamic learning rate to update Q-values during the training process and thereby achieve fast convergence. Simulation results are provided to assess the potential of the Q-learning framework for intelligent routing and to verify the overall iterative algorithm with the dynamic learning rate used in the training procedure. Our findings reveal that the proposed algorithm converges in a small number of iterations. Furthermore, the proposed algorithm can increase the accumulated rewards and achieve significant performance gains compared with the benchmark schemes.
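The abstract does not give the reward formula or the learning-rate schedule, so the following is only a minimal sketch of tabular Q-learning for next-hop selection under stated assumptions. The topology, the normalized rate and energy values, the equal-weight `link_reward` function, and the `1 / visit_count` learning rate are all hypothetical stand-ins chosen for illustration; they are not taken from the paper.

```python
import random
from collections import defaultdict

def link_reward(rate, energy, weight=0.5):
    """Assumed routing metric: weighted sum of normalized rate and energy."""
    return weight * rate + (1.0 - weight) * energy

# Hypothetical topology: node -> {next hop: (normalized rate, residual energy)}.
LINKS = {
    "S": {"A": (0.5, 0.3), "B": (0.7, 0.5)},
    "A": {"D": (0.9, 0.9)},
    "B": {"C": (0.3, 0.3)},
    "C": {"D": (0.2, 0.2)},
    "D": {},
}
DEST = "D"

def train(episodes=500, gamma=0.9, epsilon=0.2, seed=0):
    """Epsilon-greedy Q-learning with a per-pair decaying learning rate."""
    rng = random.Random(seed)
    q = defaultdict(float)      # Q-value per (node, next hop)
    visits = defaultdict(int)   # update count per (node, next hop)
    for _ in range(episodes):
        node = "S"
        while node != DEST:
            hops = list(LINKS[node])
            if rng.random() < epsilon:
                nxt = rng.choice(hops)                        # explore
            else:
                nxt = max(hops, key=lambda h: q[(node, h)])   # exploit
            visits[(node, nxt)] += 1
            # Assumed "dynamic" schedule: the step size shrinks with visits.
            alpha = 1.0 / visits[(node, nxt)]
            future = max((q[(nxt, h)] for h in LINKS[nxt]), default=0.0)
            target = link_reward(*LINKS[node][nxt]) + gamma * future
            q[(node, nxt)] += alpha * (target - q[(node, nxt)])
            node = nxt
    return q

def greedy_route(q, src="S"):
    """Follow the highest Q-value at each hop until the destination."""
    route = [src]
    while route[-1] != DEST:
        node = route[-1]
        route.append(max(LINKS[node], key=lambda h: q[(node, h)]))
    return route

if __name__ == "__main__":
    print(greedy_route(train()))
```

In this toy topology the first hop toward B has the higher immediate reward, but the route through A accumulates more discounted reward, so the learned greedy route goes through A. That is the kind of trade-off a utility-maximizing routing policy is meant to capture.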
Pages: 3707 - 3723
Page count: 17
Related papers
38 in total
  • [31] Deep Q-Learning-Based Node Positioning for Throughput-Optimal Communications in Dynamic UAV Swarm Network
    Koushik, A. M.
    Hu, Fei
    Kumar, Sunil
    IEEE TRANSACTIONS ON COGNITIVE COMMUNICATIONS AND NETWORKING, 2019, 5 (03) : 554 - 566
  • [32] Intelligent Maximum Power Extraction Control for Wind Energy Conversion Systems Based on Online Q-learning with Function Approximation
    Wei, Chun
    Zhang, Zhe
    Qiao, Wei
    Qu, Liyan
    2014 IEEE ENERGY CONVERSION CONGRESS AND EXPOSITION (ECCE), 2014, : 4911 - 4916
  • [33] Q-Learning based Intelligent Multi-Objective Particle Swarm Optimization of Light Control for Traffic Urban Congestion Management
    El Hatri, Chaimae
    Boumhidi, Jaouad
    2016 4TH IEEE INTERNATIONAL COLLOQUIUM ON INFORMATION SCIENCE AND TECHNOLOGY (CIST), 2016, : 794 - 799
  • [34] Joint Trajectory and Passive Beamforming Design for Intelligent Reflecting Surface-Aided UAV Communications: A Deep Reinforcement Learning Approach
    Wang, Liang
    Wang, Kezhi
    Pan, Cunhua
    Aslam, Nauman
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2023, 22 (11) : 6543 - 6553
  • [35] An Intersection-Based Routing Scheme Using Q-Learning in Vehicular Ad Hoc Networks for Traffic Management in the Intelligent Transportation System
    Khan, Muhammad Umair
    Hosseinzadeh, Mehdi
    Mosavi, Amir
    MATHEMATICS, 2022, 10 (20)
  • [36] Social-Aware Peer Selection for Energy Efficient D2D Communications in UAV-Assisted Networks: A Q-Learning Approach
    Nadeem, Aamir
    Ullah, Arif
    Choi, Wooyeol
    IEEE WIRELESS COMMUNICATIONS LETTERS, 2024, 13 (05) : 1468 - 1472
  • [37] Intelligent-based multi-robot path planning inspired by improved classical Q-learning and improved particle swarm optimization with perturbed velocity
    Das, P. K.
    Behera, H. S.
    Panigrahi, B. K.
    ENGINEERING SCIENCE AND TECHNOLOGY-AN INTERNATIONAL JOURNAL-JESTECH, 2016, 19 (01) : 651 - 669
  • [38] Proficient link state routing in mobile ad hoc network-based deep Q-learning network optimized with chaotic bat swarm optimization algorithm
    Rahul, P.
    Kaarthick, B.
    INTERNATIONAL JOURNAL OF COMMUNICATION SYSTEMS, 2023, 36 (01)