Joint UAV trajectory and communication design with heterogeneous multi-agent reinforcement learning

Citations: 0
Authors
Xuanhan ZHOU [1]
Jun XIONG [1]
Haitao ZHAO [1]
Xiaoran LIU [1]
Baoquan REN [2]
Xiaochen ZHANG [1]
Jibo WEI [1]
Hao YIN [2]
Affiliations
[1] College of Electronic Science and Technology, National University of Defense Technology
[2] Systems Engineering Institute, Academy of Military Sciences PLA
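Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TN929.5 [Mobile communication]; TP18 [Artificial intelligence theory]; V279 [Unmanned aircraft]
Discipline classification codes
081104; 0812; 0835; 1405; 1111
Abstract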
Unmanned aerial vehicles (UAVs) are recognized as an effective means of delivering emergency communication services when terrestrial infrastructure is unavailable. This paper investigates a multi-UAV-assisted communication system, where we jointly optimize the UAVs’ trajectories, user association, and the transmit power of ground users (GUs) to maximize a defined fairness-weighted throughput metric. Owing to the dynamic nature of UAVs, this problem has to be solved in real time. However, the problem’s non-convex and combinatorial nature poses challenges for conventional optimization-based algorithms, particularly in scenarios without central controllers. To address this issue, we propose a multi-agent deep reinforcement learning (MADRL) approach that provides distributed and online solutions. In contrast to previous MADRL-based methods that consider only UAV agents, we model UAVs and GUs as heterogeneous agents sharing a common objective. Specifically, UAVs are tasked with optimizing their trajectories, while GUs are responsible for selecting a UAV for association and determining a transmit power level. To learn policies for these heterogeneous agents, we design a heterogeneous coordinated QMIX (HC-QMIX) algorithm to train local Q-networks in a centralized manner. With these well-trained local Q-networks, UAVs and GUs can make individual decisions based on their local observations. Extensive simulation results demonstrate that the proposed algorithm outperforms state-of-the-art benchmarks in terms of total throughput and system fairness.
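The abstract relies on the general QMIX idea of centralized training with decentralized execution: if the joint value is a monotonic function of each local Q-value, every agent can act greedily on its own Q-values and still recover the joint greedy action. The sketch below illustrates that property only. It is not the paper's HC-QMIX algorithm: the agent names, action sets, Q-values, and weights are hypothetical, and a linear mixer with absolute-valued weights stands in for the hypernetwork-based mixing network that QMIX normally uses.

```python
import itertools

# Hypothetical local Q-values for one time step (illustrative numbers only).
# A UAV agent picks a movement direction; a GU agent picks a
# (UAV index, power level) pair, so the agent types have different action sets.
local_q = {
    "uav_0": {"north": 0.2, "south": -0.1, "hover": 0.5},
    "gu_0": {(0, "low"): 0.3, (0, "high"): 0.7, (1, "low"): 0.1},
    "gu_1": {(0, "low"): -0.4, (1, "high"): 0.6},
}

# Assumed raw mixing weights. Taking absolute values keeps Q_tot
# non-decreasing in every local Q-value (the monotonicity constraint).
raw_weights = {"uav_0": -1.5, "gu_0": 0.8, "gu_1": 0.3}
bias = 0.1


def mix(q_values):
    """Monotonic linear mixer: Q_tot = sum_i |w_i| * Q_i + b."""
    return sum(abs(raw_weights[a]) * q for a, q in q_values.items()) + bias


def decentralized_actions():
    """Each agent greedily maximizes its own local Q-value."""
    return {a: max(qs, key=qs.get) for a, qs in local_q.items()}


def centralized_actions():
    """Brute-force search for the joint action that maximizes Q_tot."""
    agents = list(local_q)
    best_joint, best_value = None, float("-inf")
    for joint in itertools.product(*(local_q[a] for a in agents)):
        value = mix({a: local_q[a][act] for a, act in zip(agents, joint)})
        if value > best_value:
            best_joint, best_value = dict(zip(agents, joint)), value
    return best_joint


if __name__ == "__main__":
    print(decentralized_actions())
    print(centralized_actions())
```

Under these assumptions the per-agent greedy choices coincide with the brute-force joint maximizer, which is why local Q-networks trained through a monotonic mixer can be executed without a central controller.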
Pages: 225-245
Page count: 21
Related papers
50 in total
  • [1] Joint UAV trajectory and communication design with heterogeneous multi-agent reinforcement learning
    Zhou, Xuanhan
    Xiong, Jun
    Zhao, Haitao
    Liu, Xiaoran
    Ren, Baoquan
    Zhang, Xiaochen
    Wei, Jibo
    Yin, Hao
    SCIENCE CHINA-INFORMATION SCIENCES, 2024, 67 (03)
  • [2] Joint Trajectory and Communication Optimization for Heterogeneous Vehicles in Maritime SAR: Multi-Agent Reinforcement Learning
    Lei, Chengjia
    Wu, Shaohua
    Yang, Yi
    Xue, Jiayin
    Zhang, Qinyu
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2024, 73 (09) : 12328 - 12344
  • [3] Distributed Safe Multi-Agent Reinforcement Learning: Joint Design of THz-Enabled UAV Trajectory and Channel Allocation
    Termehchi, Atefeh
    Syed, Aisha
    Kennedy, William Sean
    Erol-Kantarci, Melike
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2024, 73 (10) : 14172 - 14186
  • [4] Multi-Agent Deep Reinforcement Learning for Trajectory Design and Power Allocation in Multi-UAV Networks
    Zhao, Nan
    Liu, Zehua
    Cheng, Yiqiang
    IEEE ACCESS, 2020, 8 : 139670 - 139679
  • [5] Symmetry-Augmented Multi-Agent Reinforcement Learning for Scalable UAV Trajectory Design and User Scheduling
    Zhou, Xuanhan
    Xiong, Jun
    Zhao, Haitao
    Yan, Chao
    Wei, Jibo
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (12) : 14127 - 14144
  • [6] Multi-Agent Deep Reinforcement Learning for Joint Decoupled User Association and Trajectory Design in Full-Duplex Multi-UAV Networks
    Dai, Chen
    Zhu, Kun
    Hossain, Ekram
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2023, 22 (10) : 6056 - 6070
  • [7] Joint optimization of communication and mission performance for multi-UAV collaboration network: A multi-agent reinforcement learning method
    He, Yuan
    Xie, Jun
    Hu, Guyu
    Liu, Yaqun
    Luo, Xijian
    AD HOC NETWORKS, 2024, 164
  • [8] Multi-Agent Deep Reinforcement Learning Based UAV Trajectory Optimization for Differentiated Services
    Ning, Zhaolong
    Yang, Yuxuan
    Wang, Xiaojie
    Song, Qingyang
    Guo, Lei
    Jamalipour, Abbas
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (05) : 5818 - 5834
  • [9] Cellular UAV-to-Device Communications: Trajectory Design and Mode Selection by Multi-Agent Deep Reinforcement Learning
    Wu, Fanyi
    Zhang, Hongliang
    Wu, Jianjun
    Song, Lingyang
    IEEE TRANSACTIONS ON COMMUNICATIONS, 2020, 68 (07) : 4175 - 4189
  • [10] Joint Communication-Motion Planning for UAV Swarm against Jamming with Multi-Agent Deep Reinforcement Learning
    Guo, Zhenxin
    Liu, Yiming
    Wang, Yipeng
    Meng, Yue
    Liu, Baoling
    IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, PIMRC, 2024