Decentralized Federated Reinforcement Learning for User-Centric Dynamic TFDD Control

Cited by: 7
Authors
Yin, Ziyan [1]
Wang, Zhe [2]
Li, Jun [1]
Ding, Ming [3]
Chen, Wen [4]
Jin, Shi [5]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Elect & Opt Engn, Nanjing 210094, Peoples R China
[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[3] CSIRO, Data61, Sydney, NSW 2015, Australia
[4] Shanghai Jiao Tong Univ, Dept Elect Engn, Shanghai 200240, Peoples R China
[5] Southeast Univ, Natl Mobile Commun Res Lab, Nanjing 210096, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Heuristic algorithms; Resource management; Quality of service; Time-frequency analysis; Interference; Fading channels; Dynamic scheduling; Dynamic TFDD; decentralized partially observable Markov decision process; federated learning; multi-agent reinforcement learning; resource allocation; NETWORKS; OPTIMIZATION; MANAGEMENT; ALLOCATION; SYSTEMS; 5G;
DOI
10.1109/JSTSP.2022.3221671
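Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Subject classification codes
0808; 0809
Abstract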
The explosive growth of dynamic and heterogeneous data traffic poses great challenges for 5G and beyond mobile networks. To enhance network capacity and reliability, we propose a learning-based dynamic time-frequency division duplexing (D-TFDD) scheme that adaptively allocates the uplink and downlink time-frequency resources of base stations (BSs) to meet asymmetric and heterogeneous traffic demands while alleviating inter-cell interference. We formulate the problem as a decentralized partially observable Markov decision process (Dec-POMDP) that maximizes the long-term expected sum rate under the users' packet dropping ratio constraints. To jointly optimize the global resources in a decentralized manner, we propose a federated reinforcement learning (RL) algorithm named federated Wolpertinger deep deterministic policy gradient (FWDDPG). The BSs decide their local time-frequency configurations through RL algorithms and achieve global training by exchanging local RL models with their neighbors under a decentralized federated learning framework. Specifically, to deal with the large-scale discrete action space of each BS, we adopt a DDPG-based algorithm to generate actions in a continuous space, and then use the Wolpertinger policy to reduce the mapping errors from the continuous action space back to the discrete action space. Simulation results demonstrate the superiority of our proposed algorithm over the benchmark algorithms with respect to system sum rate.
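The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' FWDDPG implementation: the function names, the Euclidean distance, the toy critic, and the plain unweighted average over neighbor models are all assumptions made for illustration, since the abstract does not specify them. The first function shows the general Wolpertinger idea of mapping a continuous proto-action to its k nearest discrete actions and keeping the one a critic scores highest; the second shows one simple way a BS might combine its model parameters with those received from neighbors.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def wolpertinger_select(
    proto_action: Vector,
    discrete_actions: List[Vector],
    q_value: Callable[[Vector], float],
    k: int = 3,
) -> Vector:
    """Map a continuous proto-action to a discrete action.

    Step 1: keep the k discrete actions closest to the proto-action
            (Euclidean distance is an assumption of this sketch).
    Step 2: return the candidate with the highest critic score.
    """
    if not discrete_actions:
        raise ValueError("discrete_actions must not be empty")
    nearest = sorted(
        discrete_actions, key=lambda a: math.dist(a, proto_action)
    )[:k]
    return max(nearest, key=q_value)


def average_with_neighbors(
    local_params: List[float], neighbor_params: List[List[float]]
) -> List[float]:
    """Element-wise unweighted mean of the local and neighbor parameters.

    Plain averaging is a hypothetical aggregation rule; the aggregation
    used in the paper is not stated in the abstract.
    """
    models = [local_params, *neighbor_params]
    return [
        sum(m[i] for m in models) / len(models)
        for i in range(len(local_params))
    ]


# Hypothetical example: each action is an (uplink, downlink) resource split.
actions = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
toy_critic = lambda a: a[0]  # toy critic that simply prefers more uplink

chosen = wolpertinger_select((0.6, 0.4), actions, toy_critic, k=2)
merged = average_with_neighbors([1.0, 2.0], [[3.0, 4.0], [5.0, 6.0]])
```

With k = 1 the selection reduces to nearest-neighbor rounding of the proto-action; a larger k lets the critic override the nearest candidate, which is how the Wolpertinger step can reduce the mapping error mentioned in the abstract.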
Pages: 40-53
Number of pages: 14
Related papers
50 in total
  • [41] Blockchain-Empowered Federated Learning for Healthcare Metaverses: User-Centric Incentive Mechanism With Optimal Data Freshness
    Kang, Jiawen
    Wen, Jinbo
    Ye, Dongdong
    Lai, Bingkun
    Wu, Tianhao
    Xiong, Zehui
    Nie, Jiangtian
    Niyato, Dusit
    Zhang, Yang
    Xie, Shengli
    IEEE TRANSACTIONS ON COGNITIVE COMMUNICATIONS AND NETWORKING, 2024, 10 (01) : 348 - 362
  • [42] Decentralized User-Centric Scheduling with Low Rate Feedback for Mobile Small Cells
    Ni, Wei
    Collings, Iain B.
    Liu, Ren Ping
    IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2013, 12 (12) : 6106 - 6120
  • [43] Dynamic Traffic Congestion Pricing Mechanism with User-Centric Considerations
    Kim Thien Bui
    Vu Anh Huynh
    Frazzoli, Emilio
    2012 15TH INTERNATIONAL IEEE CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC), 2012, : 147 - 154
  • [44] User-Centric Distributed Spectrum Sharing in Dynamic Network Architectures
    Shafigh, Alireza Shams
    Glisic, Savo
    Hossain, Ekram
    Lorenzo, Beatriz
    DaSilva, Luiz A.
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2019, 27 (01) : 15 - 28
  • [45] A user-centric approach to dynamic adaptation of reusable communication services
    Andrew A. Allen
    Fábio M. Costa
    Peter J. Clarke
    Personal and Ubiquitous Computing, 2016, 20 : 209 - 227
  • [46] A user-centric approach to dynamic adaptation of reusable communication services
    Allen, Andrew A.
    Costa, Fabio M.
    Clarke, Peter J.
    PERSONAL AND UBIQUITOUS COMPUTING, 2016, 20 (02) : 209 - 227
  • [47] Representation of User Satisfaction and Fairness Evaluation for User-centric Dynamic Spectrum Access
    Ha Nguyen Tran
    Hasegawa, Mikio
    Murata, Yoshitoshi
    Harada, Hiroshi
    2009 IEEE 20TH INTERNATIONAL SYMPOSIUM ON PERSONAL, INDOOR AND MOBILE RADIO COMMUNICATIONS, 2009, : 838 - 842
  • [48] A User-Centric Adaptive Learning System for E-Learning 2.0
    Huang, Shiu-Li
    Shiu, Jung-Hung
    EDUCATIONAL TECHNOLOGY & SOCIETY, 2012, 15 (03): : 214 - 225
  • [49] Online Learning Framework based on user-centric access behavior
    Huang, Guohao
    Jiang, Hao
    Xie, Jing
    Zeng, Yuanyuan
    Yi, Shuwen
    IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS), 2018, : 290 - 297
  • [50] An Online Learning Approach to Sequential User-Centric Selection Problems
    Chen, Junpu
    Xie, Hong
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 6231 - 6238