Decentralized Federated Reinforcement Learning for User-Centric Dynamic TFDD Control

Cited by: 7

Authors:

Yin, Ziyan [1]

Wang, Zhe [2]

Li, Jun [1]

Ding, Ming [3]

Chen, Wen [4]

Jin, Shi [5]

Affiliations:

[1] Nanjing Univ Sci & Technol, Sch Elect & Opt Engn, Nanjing 210094, Peoples R China

[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China

[3] CSIRO, Data61, Sydney, NSW 2015, Australia

[4] Shanghai Jiao Tong Univ, Dept Elect Engn, Shanghai 200240, Peoples R China

[5] Southeast Univ, Natl Mobile Commun Res Lab, Nanjing 210096, Peoples R China

Source:

IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING | 2023 / Vol. 17 / No. 01

Funding:

National Natural Science Foundation of China;

Keywords:

Heuristic algorithms; Resource management; Quality of service; Time-frequency analysis; Interference; Fading channels; Dynamic scheduling; Dynamic TFDD; decentralized partially observable Markov decision process; federated learning; multi-agent reinforcement learning; resource allocation; NETWORKS; OPTIMIZATION; MANAGEMENT; ALLOCATION; SYSTEMS; 5G;

DOI:

10.1109/JSTSP.2022.3221671

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

The explosive growth of dynamic and heterogeneous data traffic brings great challenges for 5G and beyond mobile networks. To enhance the network capacity and reliability, we propose a learning-based dynamic time-frequency division duplexing (D-TFDD) scheme that adaptively allocates the uplink and downlink time-frequency resources of base stations (BSs) to meet the asymmetric and heterogeneous traffic demands while alleviating the inter-cell interference. We formulate the problem as a decentralized partially observable Markov decision process (Dec-POMDP) that maximizes the long-term expected sum rate under the users' packet dropping ratio constraints. In order to jointly optimize the global resources in a decentralized manner, we propose a federated reinforcement learning (RL) algorithm named federated Wolpertinger deep deterministic policy gradient (FWDDPG) algorithm. The BSs decide their local time-frequency configurations through RL algorithms and achieve global training via exchanging local RL models with their neighbors under a decentralized federated learning framework. Specifically, to deal with the large-scale discrete action space of each BS, we adopt a DDPG-based algorithm to generate actions in a continuous space, and then utilize Wolpertinger policy to reduce the mapping errors from continuous action space back to discrete action space. Simulation results demonstrate the superiority of our proposed algorithm to the benchmark algorithms with respect to system sum rate.
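The two mechanisms named in the abstract, Wolpertinger action selection and model exchange between neighboring BSs, can be illustrated with a minimal Python sketch. This is not the authors' FWDDPG implementation: the function names, the Euclidean distance metric, the value of k, the toy critic, and the unweighted averaging rule are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def wolpertinger_select(proto_action, discrete_actions, q_value, k=3):
    """Map a continuous proto-action to the index of a discrete action.

    Assumed procedure (generic Wolpertinger idea, not the paper's exact rule):
    1. Rank discrete actions by Euclidean distance to the proto-action.
    2. Keep the k nearest candidates.
    3. Return the candidate with the highest critic value.
    """
    ranked = sorted(
        range(len(discrete_actions)),
        key=lambda i: math.dist(discrete_actions[i], proto_action),
    )
    candidates = ranked[:k]
    return max(candidates, key=lambda i: q_value(discrete_actions[i]))


def neighbor_average(params, neighbors):
    """One round of decentralized model averaging (assumed unweighted).

    params:    dict mapping BS id -> list of model parameters
    neighbors: dict mapping BS id -> list of neighboring BS ids
    Each BS replaces its parameters with the mean over itself and its neighbors.
    """
    averaged = {}
    for bs, own in params.items():
        group = [own] + [params[n] for n in neighbors[bs]]
        averaged[bs] = [sum(vals) / len(group) for vals in zip(*group)]
    return averaged


# Hypothetical discrete configurations: (uplink share, downlink share).
actions = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]


def toy_critic(action):
    # Toy critic (assumption): prefers an uplink share close to 0.6.
    return -abs(action[0] - 0.6)


chosen = wolpertinger_select((0.9, 0.1), actions, toy_critic, k=2)

# Three BSs on a line topology 0 - 1 - 2, each with a 2-parameter "model".
models = {0: [0.0, 0.0], 1: [3.0, 3.0], 2: [6.0, 6.0]}
topology = {0: [1], 1: [0, 2], 2: [1]}
mixed = neighbor_average(models, topology)
```

With k=1 the selection reduces to a plain nearest-neighbor rounding of the proto-action; a larger k lets the critic override the nearest candidate, which is the sense in which the Wolpertinger step can reduce the error of mapping from the continuous action space back to the discrete one.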


Pages: 40-53

Number of pages: 14

