Learning Automata Based Q-Learning for Content Placement in Cooperative Caching

Cited by: 43
Authors
Yang, Zhong [1 ]
Liu, Yuanwei [1 ]
Chen, Yue [1 ]
Jiao, Lei [2 ]
Affiliations
[1] Queen Mary Univ London, Sch Elect Engn & Comp Sci, London E1 4NS, England
[2] Univ Agder, Dept Informat & Commun Technol, N-4879 Grimstad, Norway
Keywords
Cooperative caching; Wireless communication; Prediction algorithms; Recurrent neural networks; Learning automata; Optimization; Quality of experience; Learning automata based Q-learning; quality of experience (QoE); wireless cooperative caching; user mobility prediction; content popularity prediction; NONORTHOGONAL MULTIPLE-ACCESS; WIRELESS; OPTIMIZATION; NETWORKS; DELIVERY;
DOI
10.1109/TCOMM.2020.2982136
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808; 0809;
Abstract
An optimization problem of content placement in cooperative caching is formulated, with the aim of maximizing the sum mean opinion score (MOS) of mobile users. First, since user mobility and content popularity have significant impacts on the user experience, a recurrent neural network (RNN) is invoked for user mobility prediction and content popularity prediction. In particular, real-world data collected from a GPS-tracker app on smartphones is used to evaluate the accuracy of the user mobility prediction. Then, based on the predicted positions of mobile users and the predicted content popularity, a learning automata based Q-learning (LAQL) algorithm for cooperative caching is proposed, in which a learning automaton (LA) performs the action selection for Q-learning in a random but stationary environment. It is proven that the LA-based action selection scheme enables every state to select the optimal action with arbitrarily high probability, provided that Q-learning eventually converges to the optimal Q-value. In the LAQL algorithm, a central processor acts as the intelligent agent and iteratively allocates contents to base stations (BSs) according to the reward or penalty fed back from the BSs and the users. To characterize the performance of the proposed LAQL algorithm, the sum MOS of users is used to define the reward function. Extensive simulation results reveal that: 1) the prediction error of the RNN-based algorithm decreases as the number of iterations and nodes increases; 2) the proposed LAQL achieves a significant performance improvement over the traditional Q-learning algorithm; and 3) the cooperative caching scheme outperforms non-cooperative caching and random caching by 3% and 4%, respectively.
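To make the LAQL idea in the abstract concrete, the sketch below shows, under stated assumptions, how a per-state learning automaton can replace epsilon-greedy exploration in tabular Q-learning. It is a minimal illustration only: the state/action encoding, hyperparameters (n_states, n_actions, alpha, gamma, la_rate), and the reward-inaction update rule are hypothetical and do not reproduce the paper's exact reward/penalty scheme or its MOS-based reward function.

```python
import numpy as np

# Minimal LAQL-style sketch (illustrative, not the authors' implementation):
# a Q-table is updated by standard temporal-difference learning, while a
# per-state learning automaton maintains action probabilities and performs
# the action selection instead of epsilon-greedy sampling.

rng = np.random.default_rng(0)

n_states, n_actions = 10, 4            # hypothetical: cache states x candidate contents
alpha, gamma, la_rate = 0.1, 0.9, 0.05 # hypothetical learning rates and discount

Q = np.zeros((n_states, n_actions))                  # Q-value table
P = np.full((n_states, n_actions), 1.0 / n_actions)  # LA action probabilities per state

def select_action(state: int) -> int:
    """Sample a content-placement action from this state's automaton."""
    return int(rng.choice(n_actions, p=P[state]))

def laql_update(state: int, action: int, reward: float, next_state: int) -> None:
    """One LAQL step: Q-learning update plus an LA probability update."""
    # Temporal-difference update of the Q-table.
    td_target = reward + gamma * Q[next_state].max()
    Q[state, action] += alpha * (td_target - Q[state, action])

    # Linear reward-inaction style update: shift probability mass toward the
    # action that is currently greedy for this state (a common LA heuristic).
    greedy = int(Q[state].argmax())
    P[state] += la_rate * (np.eye(n_actions)[greedy] - P[state])
    P[state] /= P[state].sum()          # renormalize against numerical drift
```

In the paper's setting, `reward` would be derived from the sum MOS of the users served by the cooperative BSs, and the central processor would run the update for the joint content-placement decision; those details are omitted here.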
Pages: 3667 - 3680
Number of pages: 14
Related Papers
Showing items [31]-[40] of 50
  • [31] Backward Q-learning: The combination of Sarsa algorithm and Q-learning
    Wang, Yin-Hao
    Li, Tzuu-Hseng S.
    Lin, Chih-Jui
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2013, 26 (09) : 2184 - 2193
  • [32] Learning-Based Cooperative Content Caching Policy for Mobile Edge Computing
    Jiang, Wei
    Feng, Gang
    Qin, Shuang
    Liang, Ying-Chang
    ICC 2019 - 2019 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2019,
  • [33] Regional Cooperative Multi-agent Q-learning Based on Potential Field
    Liu, Liang
    Li, Longshu
    ICNC 2008: FOURTH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, VOL 6, PROCEEDINGS, 2008, : 535 - 539
  • [34] Q-learning system based on cooperative least squares support vector machine
    School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou 221116, China
    ZIDONGHUA XUEBAO / ACTA AUTOMATICA SINICA, 2009, (02): 214 - 219
  • [35] Q-LEARNING BASED THERAPY MODELING
    Jacak, Witold
    Proell, Karin
    EMSS 2009: 21ST EUROPEAN MODELING AND SIMULATION SYMPOSIUM, VOL II, 2009: 204+
  • [36] DeepChunk: Deep Q-Learning for Chunk-Based Caching in Wireless Data Processing Networks
    Wang, Yimeng
    Li, Yongbo
    Lan, Tian
    Aggarwal, Vaneet
    IEEE TRANSACTIONS ON COGNITIVE COMMUNICATIONS AND NETWORKING, 2019, 5 (04) : 1034 - 1045
  • [37] Whittle Index-Based Q-Learning for Wireless Edge Caching With Linear Function Approximation
    Xiong, Guojun
    Wang, Shufan
    Li, Jian
    Singh, Rahul
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2024, 32 (05) : 4286 - 4301
  • [38] Virtual Machine Placement Via Q-Learning with Function Approximation
    Duong, Thai
    Chu, Yu-Jung
    Thinh Nguyen
    Chakareski, Jacob
    2015 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2015,
  • [39] A study on expertise of agents and its effects on cooperative Q-learning
    Araabi, Babak Nadjar
    Mastoureshgh, Sahar
    Ahmadabadi, Majid Nili
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2007, 37 (02): 398 - 409
  • [40] Predicting and Preventing Coordination Problems in Cooperative Q-learning Systems
    Fulda, Nancy
    Ventura, Dan
    20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2007, : 780 - 785