Learning Automata Based Q-Learning for Content Placement in Cooperative Caching

Cited by: 43
Authors
Yang, Zhong [1 ]
Liu, Yuanwei [1 ]
Chen, Yue [1 ]
Jiao, Lei [2 ]
Affiliations
[1] Queen Mary Univ London, Sch Elect Engn & Comp Sci, London E1 4NS, England
[2] Univ Agder, Dept Informat & Commun Technol, N-4879 Grimstad, Norway
Keywords
Cooperative caching; Wireless communication; Prediction algorithms; Recurrent neural networks; Learning automata; Optimization; Quality of experience; Learning automata based Q-learning; quality of experience (QoE); wireless cooperative caching; user mobility prediction; content popularity prediction; NONORTHOGONAL MULTIPLE-ACCESS; WIRELESS; OPTIMIZATION; NETWORKS; DELIVERY;
DOI
10.1109/TCOMM.2020.2982136
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808; 0809;
Abstract
An optimization problem of content placement in cooperative caching is formulated, with the aim of maximizing the sum mean opinion score (MOS) of mobile users. First, since user mobility and content popularity have significant impacts on the user experience, a recurrent neural network (RNN) is invoked for user mobility prediction and content popularity prediction. In particular, real-world data collected from a GPS-tracker app on smartphones is used to evaluate the accuracy of the user mobility prediction. Then, based on the predicted positions of mobile users and the predicted content popularity, a learning automata based Q-learning (LAQL) algorithm for cooperative caching is proposed, in which a learning automaton (LA) performs the action selection for Q-learning in a random but stationary environment. It is proven that the LA-based action selection scheme enables every state to select the optimal action with arbitrarily high probability, provided that Q-learning eventually converges to the optimal Q-value. In the LAQL algorithm, a central processor acts as the intelligent agent and iteratively allocates contents to base stations (BSs) according to the reward or penalty fed back from the BSs and the users. To characterize the performance of the proposed LAQL algorithm, the sum MOS of users is used to define the reward function. Extensive simulation results reveal that: 1) the prediction error of the RNN-based algorithm decreases as the number of iterations and nodes increases; 2) the proposed LAQL achieves a significant performance improvement over the traditional Q-learning algorithm; and 3) the cooperative caching scheme outperforms non-cooperative caching and random caching by 3% and 4%, respectively.
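To make the LAQL idea in the abstract concrete, the sketch below shows, under stated assumptions, how a per-state learning automaton can replace epsilon-greedy exploration in tabular Q-learning. It is a minimal illustration only: the state/action encoding, hyperparameters (n_states, n_actions, alpha, gamma, la_rate), and the reward-inaction update rule are hypothetical and do not reproduce the paper's exact reward/penalty scheme or its MOS-based reward function.

```python
import numpy as np

# Minimal LAQL-style sketch (illustrative, not the authors' implementation):
# a Q-table is updated by standard temporal-difference learning, while a
# per-state learning automaton maintains action probabilities and performs
# the action selection instead of epsilon-greedy sampling.

rng = np.random.default_rng(0)

n_states, n_actions = 10, 4            # hypothetical: cache states x candidate contents
alpha, gamma, la_rate = 0.1, 0.9, 0.05 # hypothetical learning rates and discount

Q = np.zeros((n_states, n_actions))                  # Q-value table
P = np.full((n_states, n_actions), 1.0 / n_actions)  # LA action probabilities per state

def select_action(state: int) -> int:
    """Sample a content-placement action from this state's automaton."""
    return int(rng.choice(n_actions, p=P[state]))

def laql_update(state: int, action: int, reward: float, next_state: int) -> None:
    """One LAQL step: Q-learning update plus an LA probability update."""
    # Temporal-difference update of the Q-table.
    td_target = reward + gamma * Q[next_state].max()
    Q[state, action] += alpha * (td_target - Q[state, action])

    # Linear reward-inaction style update: shift probability mass toward the
    # action that is currently greedy for this state (a common LA heuristic).
    greedy = int(Q[state].argmax())
    P[state] += la_rate * (np.eye(n_actions)[greedy] - P[state])
    P[state] /= P[state].sum()          # renormalize against numerical drift
```

In the paper's setting, `reward` would be derived from the sum MOS of the users served by the cooperative BSs, and the central processor would run the update for the joint content-placement decision; those details are omitted here.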
Pages: 3667 - 3680
Number of pages: 14
Related Papers
Showing items [31]-[40] of 50
  • [31] Backward Q-learning: The combination of Sarsa algorithm and Q-learning
    Wang, Yin-Hao
    Li, Tzuu-Hseng S.
    Lin, Chih-Jui
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2013, 26 (09) : 2184 - 2193
  • [32] Learning-Based Cooperative Content Caching Policy for Mobile Edge Computing
    Jiang, Wei
    Feng, Gang
    Qin, Shuang
    Liang, Ying-Chang
    ICC 2019 - 2019 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2019,
  • [33] Regional Cooperative Multi-agent Q-learning Based on Potential Field
    Liu, Liang
    Li, Longshu
    ICNC 2008: FOURTH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, VOL 6, PROCEEDINGS, 2008, : 535 - 539
  • [34] Q-learning system based on cooperative least squares support vector machine
    School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou 221116, China
    ZIDONGHUA XUEBAO / ACTA AUTOMATICA SINICA, 2009, (02): 214 - 219
  • [35] Q-LEARNING BASED THERAPY MODELING
    Jacak, Witold
    Proell, Karin
    EMSS 2009: 21ST EUROPEAN MODELING AND SIMULATION SYMPOSIUM, VOL II, 2009: 204+
  • [36] DeepChunk: Deep Q-Learning for Chunk-Based Caching in Wireless Data Processing Networks
    Wang, Yimeng
    Li, Yongbo
    Lan, Tian
    Aggarwal, Vaneet
    IEEE TRANSACTIONS ON COGNITIVE COMMUNICATIONS AND NETWORKING, 2019, 5 (04) : 1034 - 1045
  • [37] Whittle Index-Based Q-Learning for Wireless Edge Caching With Linear Function Approximation
    Xiong, Guojun
    Wang, Shufan
    Li, Jian
    Singh, Rahul
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2024, 32 (05) : 4286 - 4301
  • [38] Virtual Machine Placement Via Q-Learning with Function Approximation
    Duong, Thai
    Chu, Yu-Jung
    Thinh Nguyen
    Chakareski, Jacob
    2015 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2015,
  • [39] A study on expertise of agents and its effects on cooperative Q-learning
    Araabi, Babak Nadjar
    Mastoureshgh, Sahar
    Ahmadabadi, Majid Nili
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2007, 37 (02): 398 - 409
  • [40] Predicting and Preventing Coordination Problems in Cooperative Q-learning Systems
    Fulda, Nancy
    Ventura, Dan
    20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2007, : 780 - 785