Q-Learning Aided Intelligent Routing With Maximum Utility in Cognitive UAV Swarm for Emergency Communications

Cited by: 22
Authors
Zhang, Long [1 ,2 ]
Ma, Xiaozheng [3 ]
Zhuang, Zirui [4 ]
Xu, Haitao [5 ]
Sharma, Vishal [6 ]
Han, Zhu [7 ,8 ]
Affiliations
[1] Hebei Univ Engn, Sch Informat & Elect Engn, Handan 056038, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Mobile Commun Technol, Chongqing 400065, Peoples R China
[3] Hebei Univ Engn, Sch Informat & Elect Engn, Handan 056038, Peoples R China
[4] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China
[5] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing 100083, Peoples R China
[6] Queens Univ Belfast, Sch Elect Elect Engn & Comp Sci, Belfast BT9 5BN, North Ireland
[7] Univ Houston, Dept Elect & Comp Engn, Houston, TX 77004 USA
[8] Kyung Hee Univ, Dept Comp Sci & Engn, Seoul 446701, South Korea
Funding
National Natural Science Foundation of China;
Keywords
Emergency communications; UAV swarm; cognitive radio; intelligent routing; maximum utility; Q-learning; SPECTRUM ACCESS; NETWORKS; OPPORTUNITIES; INTEGRATION; CHALLENGES; DELAY;
DOI
10.1109/TVT.2022.3221538
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract
This article studies the routing problem in a cognitive unmanned aerial vehicle (UAV) swarm (CU-SWARM), which integrates cognitive radio into a swarm of UAVs within a three-layer hierarchical aerial-ground integrated network architecture for emergency communications. In particular, the flexibly converged architecture uses a UAV swarm and a high-altitude platform to support aerial sensing and access, respectively, over the disaster-affected areas. We develop a Q-learning framework for intelligent routing that maximizes the utility of CU-SWARM. To characterize the reward function, we take into account both the routing metric design and the optimization of candidate UAV selection. The routing metric jointly captures the achievable rate and the residual energy of each UAV. In addition, under location, arc, and direction constraints, a circular sector is modeled by properly choosing the central angle and the acceptable signal-to-noise ratio for a UAV, so as to optimize the candidate UAV selection. With this setup, we further propose a low-complexity iterative algorithm that uses a dynamic learning rate to update Q-values during training and thereby achieve fast convergence. Simulation results are provided to assess the potential of the Q-learning framework for intelligent routing and to verify the overall iterative algorithm with the dynamic learning rate used in training. Our findings reveal that the proposed algorithm converges within a small number of iterations. Furthermore, the proposed algorithm increases the accumulated reward and achieves significant performance gains over the benchmark schemes.
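The abstract describes Q-values updated with a dynamic learning rate and a reward built from achievable rate and residual energy. The following is a minimal sketch of that general idea, not the authors' algorithm: the 5-UAV topology, the rate and energy numbers, the equal-weight reward, and the 1/visits learning-rate schedule are all assumptions chosen for illustration, and the candidate next hops are taken as already selected.

```python
import random

# Hypothetical topology: node -> candidate next hops (assumed pre-filtered).
NEIGHBORS = {0: [1, 2], 1: [3], 2: [3, 4], 3: [4], 4: []}
# Assumed normalized achievable rate per link and residual energy per UAV.
RATE = {(0, 1): 0.9, (0, 2): 0.2, (1, 3): 0.8,
        (2, 3): 0.5, (2, 4): 0.3, (3, 4): 0.9}
ENERGY = {0: 1.0, 1: 0.7, 2: 0.5, 3: 0.8, 4: 1.0}
SOURCE, DEST = 0, 4


def reward(u, v, weight=0.5):
    """Illustrative utility: weighted sum of link rate and next-hop energy."""
    return weight * RATE[(u, v)] + (1 - weight) * ENERGY[v]


def train(episodes=2000, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning over (node, next hop) pairs with epsilon-greedy search."""
    rng = random.Random(seed)
    q = {(u, v): 0.0 for u, nbrs in NEIGHBORS.items() for v in nbrs}
    visits = {key: 0 for key in q}
    for _ in range(episodes):
        u = SOURCE
        while u != DEST:
            nbrs = NEIGHBORS[u]
            if rng.random() < epsilon:
                v = rng.choice(nbrs)                      # explore
            else:
                v = max(nbrs, key=lambda n: q[(u, n)])    # exploit
            visits[(u, v)] += 1
            # Assumed dynamic learning rate: decays with the visit count.
            alpha = 1.0 / visits[(u, v)]
            # Best Q-value available from the next hop (0 at the destination).
            future = max((q[(v, n)] for n in NEIGHBORS[v]), default=0.0)
            q[(u, v)] += alpha * (reward(u, v) + gamma * future - q[(u, v)])
            u = v
    return q


def greedy_route(q, source=SOURCE):
    """Follow the highest Q-value at each hop until the destination."""
    route = [source]
    while route[-1] != DEST:
        u = route[-1]
        route.append(max(NEIGHBORS[u], key=lambda n: q[(u, n)]))
    return route


if __name__ == "__main__":
    print(greedy_route(train()))
```

In this toy setup the learned greedy route favors the hops with higher combined rate and residual energy. A decaying learning rate of this kind takes large steps early and smaller ones as a link is revisited, which is one common way to speed up convergence in tabular Q-learning.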
Pages: 3707-3723
Number of pages: 17
Related papers
38 in total
  • [21] Trust-Based Intelligent Routing Protocol with Q-Learning for Mission-Critical Wireless Sensor Networks
    Keum, DooHo
    Ko, Young-Bae
    SENSORS, 2022, 22 (11)
  • [22] A Q-Learning and Fuzzy Logic-Based Hierarchical Routing Scheme in the Intelligent Transportation System for Smart Cities
    Rahmani, Amir Masoud
    Naqvi, Rizwan Ali
    Yousefpoor, Efat
    Yousefpoor, Mohammad Sadegh
    Ahmed, Omed Hassan
    Hosseinzadeh, Mehdi
    Siddique, Kamran
    MATHEMATICS, 2022, 10 (22)
  • [23] Formation control of a mono-operated UAV fleet through ad-hoc communications: a Q-learning approach
    Zema, Nicola Roberto
    Quadri, Dominique
    Martin, Steven
    Shrit, Omar
    2019 16TH ANNUAL IEEE INTERNATIONAL CONFERENCE ON SENSING, COMMUNICATION, AND NETWORKING (SECON), 2019,
  • [24] Enhanced Online Q-Learning Scheme for Resource Allocation with Maximum Utility and Fairness in Edge-IoT Networks
    AlQerm, Ismail
    Pan, Jianli
    IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2020, 7 (04): : 3074 - 3086
  • [25] Robust Q-learning for Fast And Optimal Flying Base Station Placement Aided By Digital Twin For Emergency Use
    Guo, Terry N.
    2022 IEEE 96TH VEHICULAR TECHNOLOGY CONFERENCE (VTC2022-FALL), 2022,
  • [26] ACO intelligent task scheduling algorithm based on Q-learning optimization in a multilayer cognitive radio platform
    Xie, Zongfu
    Liu, Jinjin
    Ji, Yawei
    Li, Wanwan
    Dong, Chunxiao
    Yang, Bin
    SIMULATION-TRANSACTIONS OF THE SOCIETY FOR MODELING AND SIMULATION INTERNATIONAL, 2023,
  • [27] An Optimal QoS Multicast Routing Protocol in IoT Enabling Cognitive Radio MANETs: A Deep Q-Learning Approach
    Thong Nhat Tran
    Toan-Van Nguyen
    Shim, Kyusung
    An, Beongku
    3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE IN INFORMATION AND COMMUNICATION (IEEE ICAIIC 2021), 2021, : 279 - 283
  • [28] Intelligent neighbor selection for efficient query routing in unstructured P2P networks using Q-learning
    Shoab, Mohammad
    Al Jubayrin, Saad
    APPLIED INTELLIGENCE, 2022, 52 (06) : 6306 - 6315
  • [29] Intelligent neighbor selection for efficient query routing in unstructured P2P networks using Q-learning
    Mohammad Shoab
    Saad Al Jubayrin
    Applied Intelligence, 2022, 52 : 6306 - 6315
  • [30] Securing UAV-to-Vehicle Communications: A Curiosity-Driven Deep Q-learning Network (C-DQN) Approach
    Fu, Fang
    Jiao, Qi
    Yu, F. Richard
    Zhang, Zhicai
    Du, Jianbo
    2021 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS WORKSHOPS (ICC WORKSHOPS), 2021,