Multi-Task Deep Reinforcement Learning for Terahertz NOMA Resource Allocation With Hybrid Discrete and Continuous Actions

Cited by: 2
Authors
Hu, Zhifeng [1]
Han, Chong [1, 2, 3]
Deng, Yansha [4]
Wang, Xudong [5]
Affiliations
[1] Shanghai Jiao Tong Univ, Terahertz Wireless Commun TWC Lab, Shanghai 200240, Peoples R China
[2] Shanghai Jiao Tong Univ, Dept Engn, Shanghai 200240, Peoples R China
[3] Shanghai Jiao Tong Univ, Cooperat Medianet Innovat Ctr CMIC, Shanghai 200240, Peoples R China
[4] Kings Coll London, Dept Engn, London WC2R 2LS, England
[5] Shanghai Jiao Tong Univ, Univ Michigan Shanghai Jiao Tong Univ UM SJTU Join, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Resource management; NOMA; Throughput; Terahertz communications; Wireless communication; Multitasking; Hybrid power systems; Deep reinforcement learning (DRL); non-orthogonal multiple access (NOMA); Terahertz (THz) networks; NONORTHOGONAL MULTIPLE-ACCESS; POWER ALLOCATION; JOINT POWER; MIMO-NOMA; SYSTEMS; NETWORKS; COMMUNICATION; INTERFERENCE; CHALLENGES; CAPACITY;
DOI
10.1109/TVT.2024.3381238
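Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809;
Abstract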
Terahertz (THz) non-orthogonal multiple access (NOMA) networks have great potential for next-generation wireless communications, promising ultra-high data rates and user fairness. In THz-NOMA networks, efficient and effective long-term beamforming-bandwidth-power (BBP) allocation remains an open problem due to its non-deterministic polynomial-time hard (NP-hard) nature. In this article, the continuous nature of power and sub-array ratio assignment and the discrete nature of sub-band allocation are carefully treated. In light of these attributes, an offline hybrid discrete and continuous actions (DISCO) multi-task deep reinforcement learning (DRL) algorithm is proposed to maximize the long-term throughput. Specifically, multi-task learning enables the actor of DISCO to integrate two state-of-the-art DRL algorithms: actor-critic (AC), which only selects discrete actions, and deep deterministic policy gradient (DDPG), which only generates continuous actions. Rigorous theoretical derivations for the neural network design and backpropagation process are provided to tailor DISCO to the BBP problem. Compared with the benchmark no-learning and conventional DRL algorithms, DISCO enhances the network throughput while achieving good fairness among users. Furthermore, DISCO requires computational time on the order of hundreds of milliseconds, indicating its practicality.
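The hybrid action idea in the abstract can be illustrated with a minimal sketch. This is not the authors' DISCO implementation: the layer sizes, the single shared hidden layer, the sigmoid squashing of the continuous outputs, and the names `init_actor` and `hybrid_actor` are all assumptions made for illustration. It only shows one plausible way for a single actor to emit a discrete sub-band choice (as an actor-critic policy would) alongside continuous ratios (as a DDPG actor would) from shared features.

```python
import numpy as np

def init_actor(state_dim, hidden_dim, n_subbands, n_continuous, seed=0):
    """Create random weights for a hypothetical two-headed actor."""
    rng = np.random.default_rng(seed)
    scale = 0.1
    return {
        "W_shared": rng.normal(0.0, scale, (state_dim, hidden_dim)),
        "b_shared": np.zeros(hidden_dim),
        "W_disc": rng.normal(0.0, scale, (hidden_dim, n_subbands)),
        "b_disc": np.zeros(n_subbands),
        "W_cont": rng.normal(0.0, scale, (hidden_dim, n_continuous)),
        "b_cont": np.zeros(n_continuous),
    }

def hybrid_actor(params, state, rng):
    """Return (sub-band index, sub-band probabilities, continuous ratios)."""
    # Shared features feed both task heads (the multi-task part of the sketch).
    h = np.tanh(state @ params["W_shared"] + params["b_shared"])

    # Discrete head: softmax over sub-bands, then sample one index.
    logits = h @ params["W_disc"] + params["b_disc"]
    logits = logits - logits.max()  # shift for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    subband = int(rng.choice(len(probs), p=probs))

    # Continuous head: deterministic outputs squashed into (0, 1),
    # standing in for power and sub-array ratios (an assumption).
    z = h @ params["W_cont"] + params["b_cont"]
    ratios = 1.0 / (1.0 + np.exp(-z))
    return subband, probs, ratios

params = init_actor(state_dim=6, hidden_dim=16, n_subbands=4, n_continuous=3)
subband, probs, ratios = hybrid_actor(params, np.ones(6), np.random.default_rng(1))
```

Sharing the hidden layer is the design choice of interest here: both heads are driven by the same state features, so a training signal applied to either head would also shape the representation used by the other.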
Pages: 11647-11663
Page count: 17