Learning Automata Based Q-Learning for Content Placement in Cooperative Caching

Cited by: 43
Authors
Yang, Zhong [1]
Liu, Yuanwei [1]
Chen, Yue [1]
Jiao, Lei [2]
Affiliations
[1] Queen Mary Univ London, Sch Elect Engn & Comp Sci, London E1 4NS, England
[2] Univ Agder, Dept Informat & Commun Technol, N-4879 Grimstad, Norway
Keywords
Cooperative caching; Wireless communication; Prediction algorithms; Recurrent neural networks; Learning automata; Optimization; Quality of experience; Learning automata based Q-learning; quality of experience (QoE); wireless cooperative caching; user mobility prediction; content popularity prediction; NONORTHOGONAL MULTIPLE-ACCESS; WIRELESS; OPTIMIZATION; NETWORKS; DELIVERY;
DOI
10.1109/TCOMM.2020.2982136
Chinese Library Classification (CLC)
TM [Electrical Technology]; TN [Electronics and Communication Technology]
Discipline Classification Codes
0808; 0809
Abstract
An optimization problem of content placement in cooperative caching is formulated, with the aim of maximizing the sum mean opinion score (MOS) of mobile users. Firstly, as user mobility and content popularity have significant impacts on the user experience, a recurrent neural network (RNN) is invoked for user mobility prediction and content popularity prediction. More particularly, practical data collected from a GPS-tracker app on smartphones is used to test the accuracy of the user mobility prediction. Then, based on the predicted positions of mobile users and the predicted content popularity, a learning automata based Q-learning (LAQL) algorithm for cooperative caching is proposed, in which a learning automaton (LA) is invoked in Q-learning to obtain the optimal action selection in a random and stationary environment. It is proven that the LA based action selection scheme enables every state to select the optimal action with arbitrarily high probability, provided that Q-learning eventually converges to the optimal Q-value. In the LAQL algorithm, a central processor acts as the intelligent agent, which iteratively allocates content to base stations (BSs) according to the reward or penalty fed back from the BSs and users. To characterize the performance of the proposed LAQL algorithm, the sum MOS of users is applied to define the reward function. Extensive simulation results reveal that: 1) the prediction error of the RNN-based algorithm decreases as the numbers of iterations and nodes increase; 2) the proposed LAQL achieves significant performance improvement over the traditional Q-learning algorithm; and 3) the cooperative caching scheme outperforms non-cooperative caching and random caching by 3% and 4%, respectively.
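The LA-guided action selection the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the class name, the linear reward-inaction (L_R-I) update rule, the learning rates, and the "reinforce only when the chosen action is greedy" reward signal are all assumptions made for the sketch.

```python
import random

class LearningAutomatonQLearning:
    """Tabular Q-learning in which each state's action is drawn from a
    per-state learning automaton (an action-probability vector) instead
    of an epsilon-greedy rule. Illustrative sketch only; parameters and
    update rules are assumptions, not taken from the paper."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9, la_rate=0.05):
        self.n_actions = n_actions
        self.alpha = alpha      # Q-learning step size
        self.gamma = gamma      # discount factor
        self.la_rate = la_rate  # automaton (L_R-I) learning rate
        self.Q = [[0.0] * n_actions for _ in range(n_states)]
        # one action-probability vector (one automaton) per state
        self.P = [[1.0 / n_actions] * n_actions for _ in range(n_states)]

    def select_action(self, state):
        # Sample an action from the automaton's probability vector.
        r, cum = random.random(), 0.0
        for a, p in enumerate(self.P[state]):
            cum += p
            if r <= cum:
                return a
        return self.n_actions - 1

    def update(self, state, action, reward, next_state):
        # Standard Q-learning temporal-difference update.
        td_target = reward + self.gamma * max(self.Q[next_state])
        self.Q[state][action] += self.alpha * (td_target - self.Q[state][action])
        # L_R-I update: reinforce the chosen action only when it is
        # greedy w.r.t. the current Q-values (treated as a "reward"
        # signal for the automaton); otherwise leave P unchanged.
        greedy = max(range(self.n_actions), key=lambda a: self.Q[state][a])
        if action == greedy:
            lam = self.la_rate
            for a in range(self.n_actions):
                if a == action:
                    self.P[state][a] += lam * (1.0 - self.P[state][a])
                else:
                    self.P[state][a] *= (1.0 - lam)
```

Because L_R-I only moves probability mass toward actions that look optimal and never penalizes on failure, the probability of the greedy action grows monotonically in each state, which matches the abstract's claim that every state selects the optimal action with arbitrarily high probability once Q-learning has converged.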
Pages: 3667-3680
Page count: 14
Related Papers
50 records in total
  • [41] Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning
    Ohnishi, Shota
    Uchibe, Eiji
    Yamaguchi, Yotaro
    Nakanishi, Kosuke
    Yasui, Yuji
    Ishii, Shin
    FRONTIERS IN NEUROROBOTICS, 2019, 13
  • [42] Sequential Q-Learning With Kalman Filtering for Multirobot Cooperative Transportation
    Wang, Ying
    de Silva, Clarence W.
    IEEE-ASME TRANSACTIONS ON MECHATRONICS, 2010, 15 (02) : 261 - 268
  • [43] Evaluating cooperative-competitive dynamics with deep Q-learning
    Kopacz, Aniko
    Csato, Lehel
    Chira, Camelia
    NEUROCOMPUTING, 2023, 550
  • [44] Multi-robot Cooperative Planning by Consensus Q-learning
    Sadhu, Arup Kumar
    Konar, Amit
    Banerjee, Bonny
    Nagar, Atulya K.
    2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017, : 4158 - 4164
  • [45] Cooperative Spectrum Sensing Using Q-Learning with Experimental Validation
    Chen, Zhe
    Qiu, Robert C.
    IEEE SOUTHEASTCON 2011: BUILDING GLOBAL ENGINEERS, 2011, : 405 - 408
  • [46] Multi-agent Cooperative Alternating Q-learning Caching in D2D-enabled Cellular Networks
    Fang, Xinyuan
    Zhang, Tiankui
    Liu, Yuanwei
    Zeng, Zhimin
    2019 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2019,
  • [47] Q-learning based Reinforcement Learning Approach for Lane Keeping
    Feher, Arpad
    Aradi, Szilard
    Becsi, Tamas
    2018 18TH IEEE INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND INFORMATICS (CINTI), 2018, : 31 - 35
  • [48] Swarm Reinforcement Learning Method Based on Hierarchical Q-Learning
    Kuroe, Yasuaki
    Takeuchi, Kenya
    Maeda, Yutaka
    2021 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2021), 2021,
  • [49] Q-Learning Based Aerial Base Station Placement for Fairness Enhancement in Mobile Networks
    Ghanavi, Rozhina
    Sabbaghian, Maryam
    Yanikomeroglu, Halim
    2019 7TH IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (IEEE GLOBALSIP), 2019,
  • [50] Adaptive Learning Recommendation Strategy Based on Deep Q-learning
    Tan, Chunxi
    Han, Ruijian
    Ye, Rougang
    Chen, Kani
    APPLIED PSYCHOLOGICAL MEASUREMENT, 2020, 44 (04) : 251 - 266