Decentralized Federated Reinforcement Learning for User-Centric Dynamic TFDD Control

Cited by: 7

Authors:

Yin, Ziyan [1]

Wang, Zhe [2]

Li, Jun [1]

Ding, Ming [3]

Chen, Wen [4]

Jin, Shi [5]

Affiliations:

[1] Nanjing Univ Sci & Technol, Sch Elect & Opt Engn, Nanjing 210094, Peoples R China

[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China

[3] CSIRO, Data61, Sydney, NSW 2015, Australia

[4] Shanghai Jiao Tong Univ, Dept Elect Engn, Shanghai 200240, Peoples R China

[5] Southeast Univ, Natl Mobile Commun Res Lab, Nanjing 210096, Peoples R China

Source:

IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING | 2023 / Vol. 17 / No. 01

Funding:

National Natural Science Foundation of China;

Keywords:

Heuristic algorithms; Resource management; Quality of service; Time-frequency analysis; Interference; Fading channels; Dynamic scheduling; Dynamic TFDD; decentralized partially observable Markov decision process; federated learning; multi-agent reinforcement learning; resource allocation; NETWORKS; OPTIMIZATION; MANAGEMENT; ALLOCATION; SYSTEMS; 5G;

DOI:

10.1109/JSTSP.2022.3221671

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

The explosive growth of dynamic and heterogeneous data traffic poses significant challenges for 5G-and-beyond mobile networks. To enhance network capacity and reliability, we propose a learning-based dynamic time-frequency division duplexing (D-TFDD) scheme that adaptively allocates the uplink and downlink time-frequency resources of base stations (BSs) to meet asymmetric and heterogeneous traffic demands while alleviating inter-cell interference. We formulate the problem as a decentralized partially observable Markov decision process (Dec-POMDP) that maximizes the long-term expected sum rate subject to constraints on the users' packet dropping ratios. To jointly optimize the global resources in a decentralized manner, we propose a federated reinforcement learning (RL) algorithm named federated Wolpertinger deep deterministic policy gradient (FWDDPG). The BSs decide their local time-frequency configurations through RL algorithms and achieve global training by exchanging local RL models with their neighbors under a decentralized federated learning framework. Specifically, to deal with the large-scale discrete action space of each BS, we adopt a DDPG-based algorithm to generate actions in a continuous space, and then apply the Wolpertinger policy to reduce the mapping errors from the continuous action space back to the discrete action space. Simulation results demonstrate that the proposed algorithm outperforms the benchmark algorithms in terms of system sum rate.
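
The two mechanisms named in the abstract, the Wolpertinger mapping from a continuous action to a discrete one and the model exchange between neighboring BSs, can be illustrated with a minimal sketch. This is not the paper's implementation. The function names, the parameter `k`, the Euclidean distance metric, the `q_value` callable standing in for the critic, and the unweighted averaging rule are all assumptions made for illustration, since the abstract does not give these details.

```python
import math


def wolpertinger_select(proto_action, discrete_actions, q_value, k=3):
    """Map a continuous proto-action to one discrete action.

    proto_action: continuous vector, assumed to come from a DDPG-style actor.
    discrete_actions: candidate discrete action vectors (hypothetical set).
    q_value: callable scoring a discrete action; a stand-in for the critic.
    k: number of nearest candidates to evaluate (hypothetical default).
    """
    # Rank the candidates by Euclidean distance to the proto-action.
    nearest = sorted(
        discrete_actions, key=lambda a: math.dist(a, proto_action)
    )[:k]
    # Among the k nearest candidates, keep the one the critic rates highest.
    return max(nearest, key=q_value)


def average_with_neighbors(local_params, neighbor_params):
    """Unweighted average of one BS's parameters with its neighbors' parameters.

    The aggregation rule is an assumption; the abstract only states that
    local RL models are exchanged with neighbors.
    """
    models = [local_params] + list(neighbor_params)
    return [sum(values) / len(models) for values in zip(*models)]


if __name__ == "__main__":
    actions = [(0, 0), (0, 1), (1, 0), (1, 1)]
    toy_critic = lambda a: a[0] + a[1]  # toy scoring function, not a trained critic
    print(wolpertinger_select((0.9, 0.2), actions, toy_critic, k=2))
    print(average_with_neighbors([1.0, 2.0], [[3.0, 4.0], [5.0, 6.0]]))
```

With `k=1` the selection reduces to plain nearest-neighbor rounding. Evaluating several nearby candidates with the critic is what lets a Wolpertinger-style policy avoid picking a poor discrete action that merely happens to be closest to the actor's output.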


Pages: 40 - 53

Page count: 14
