Learning Automata Based Q-Learning for Content Placement in Cooperative Caching

Cited by: 43
Authors
Yang, Zhong [1]
Liu, Yuanwei [1]
Chen, Yue [1]
Jiao, Lei [2]
Affiliations
[1] Queen Mary Univ London, Sch Elect Engn & Comp Sci, London E1 4NS, England
[2] Univ Agder, Dept Informat & Commun Technol, N-4879 Grimstad, Norway
Keywords
Cooperative caching; Wireless communication; Prediction algorithms; Recurrent neural networks; Learning automata; Optimization; Quality of experience; Learning automata based Q-learning; quality of experience (QoE); wireless cooperative caching; user mobility prediction; content popularity prediction; NONORTHOGONAL MULTIPLE-ACCESS; WIRELESS; OPTIMIZATION; NETWORKS; DELIVERY;
DOI
10.1109/TCOMM.2020.2982136
Chinese Library Classification (CLC)
TM [Electrical Technology]; TN [Electronics and Communication Technology]
Discipline Classification Codes
0808; 0809
Abstract
An optimization problem of content placement in cooperative caching is formulated, with the aim of maximizing the sum mean opinion score (MOS) of mobile users. Firstly, as user mobility and content popularity have significant impacts on the user experience, a recurrent neural network (RNN) is invoked for user mobility prediction and content popularity prediction. More particularly, practical data collected from a GPS-tracker app on smartphones is used to test the accuracy of the user mobility prediction. Then, based on the predicted positions of mobile users and the predicted content popularity, a learning automata based Q-learning (LAQL) algorithm for cooperative caching is proposed, in which a learning automaton (LA) is invoked in Q-learning to obtain the optimal action selection in a random and stationary environment. It is proven that the LA based action selection scheme enables every state to select the optimal action with arbitrarily high probability, provided that Q-learning eventually converges to the optimal Q-value. In the LAQL algorithm, a central processor acts as the intelligent agent, which iteratively allocates content to base stations (BSs) according to the reward or penalty fed back from the BSs and users. To characterize the performance of the proposed LAQL algorithm, the sum MOS of users is applied to define the reward function. Extensive simulation results reveal that: 1) the prediction error of the RNN-based algorithm decreases as the numbers of iterations and nodes increase; 2) the proposed LAQL achieves significant performance improvement over the traditional Q-learning algorithm; and 3) the cooperative caching scheme outperforms non-cooperative caching and random caching by 3% and 4%, respectively.
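The LA-guided action selection the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the class name, the linear reward-inaction (L_R-I) update rule, the learning rates, and the "reinforce only when the chosen action is greedy" reward signal are all assumptions made for the sketch.

```python
import random

class LearningAutomatonQLearning:
    """Tabular Q-learning in which each state's action is drawn from a
    per-state learning automaton (an action-probability vector) instead
    of an epsilon-greedy rule. Illustrative sketch only; parameters and
    update rules are assumptions, not taken from the paper."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9, la_rate=0.05):
        self.n_actions = n_actions
        self.alpha = alpha      # Q-learning step size
        self.gamma = gamma      # discount factor
        self.la_rate = la_rate  # automaton (L_R-I) learning rate
        self.Q = [[0.0] * n_actions for _ in range(n_states)]
        # one action-probability vector (one automaton) per state
        self.P = [[1.0 / n_actions] * n_actions for _ in range(n_states)]

    def select_action(self, state):
        # Sample an action from the automaton's probability vector.
        r, cum = random.random(), 0.0
        for a, p in enumerate(self.P[state]):
            cum += p
            if r <= cum:
                return a
        return self.n_actions - 1

    def update(self, state, action, reward, next_state):
        # Standard Q-learning temporal-difference update.
        td_target = reward + self.gamma * max(self.Q[next_state])
        self.Q[state][action] += self.alpha * (td_target - self.Q[state][action])
        # L_R-I update: reinforce the chosen action only when it is
        # greedy w.r.t. the current Q-values (treated as a "reward"
        # signal for the automaton); otherwise leave P unchanged.
        greedy = max(range(self.n_actions), key=lambda a: self.Q[state][a])
        if action == greedy:
            lam = self.la_rate
            for a in range(self.n_actions):
                if a == action:
                    self.P[state][a] += lam * (1.0 - self.P[state][a])
                else:
                    self.P[state][a] *= (1.0 - lam)
```

Because L_R-I only moves probability mass toward actions that look optimal and never penalizes on failure, the probability of the greedy action grows monotonically in each state, which matches the abstract's claim that every state selects the optimal action with arbitrarily high probability once Q-learning has converged.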
Pages: 3667-3680
Page count: 14
Related Papers
50 records in total
  • [41] Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning
    Ohnishi, Shota
    Uchibe, Eiji
    Yamaguchi, Yotaro
    Nakanishi, Kosuke
    Yasui, Yuji
    Ishii, Shin
    FRONTIERS IN NEUROROBOTICS, 2019, 13
  • [42] Sequential Q-Learning With Kalman Filtering for Multirobot Cooperative Transportation
    Wang, Ying
    de Silva, Clarence W.
    IEEE-ASME TRANSACTIONS ON MECHATRONICS, 2010, 15 (02) : 261 - 268
  • [43] Evaluating cooperative-competitive dynamics with deep Q-learning
    Kopacz, Aniko
    Csato, Lehel
    Chira, Camelia
    NEUROCOMPUTING, 2023, 550
  • [44] Multi-robot Cooperative Planning by Consensus Q-learning
    Sadhu, Arup Kumar
    Konar, Amit
    Banerjee, Bonny
    Nagar, Atulya K.
    2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017, : 4158 - 4164
  • [45] Cooperative Spectrum Sensing Using Q-Learning with Experimental Validation
    Chen, Zhe
    Qiu, Robert C.
    IEEE SOUTHEASTCON 2011: BUILDING GLOBAL ENGINEERS, 2011, : 405 - 408
  • [46] Multi-agent Cooperative Alternating Q-learning Caching in D2D-enabled Cellular Networks
    Fang, Xinyuan
    Zhang, Tiankui
    Liu, Yuanwei
    Zeng, Zhimin
    2019 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2019,
  • [47] Q-learning based Reinforcement Learning Approach for Lane Keeping
    Feher, Arpad
    Aradi, Szilard
    Becsi, Tamas
    2018 18TH IEEE INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND INFORMATICS (CINTI), 2018, : 31 - 35
  • [48] Swarm Reinforcement Learning Method Based on Hierarchical Q-Learning
    Kuroe, Yasuaki
    Takeuchi, Kenya
    Maeda, Yutaka
    2021 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2021), 2021,
  • [49] Q-Learning Based Aerial Base Station Placement for Fairness Enhancement in Mobile Networks
    Ghanavi, Rozhina
    Sabbaghian, Maryam
    Yanikomeroglu, Halim
    2019 7TH IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (IEEE GLOBALSIP), 2019,
  • [50] Adaptive Learning Recommendation Strategy Based on Deep Q-learning
    Tan, Chunxi
    Han, Ruijian
    Ye, Rougang
    Chen, Kani
    APPLIED PSYCHOLOGICAL MEASUREMENT, 2020, 44 (04) : 251 - 266