Learning Automata Based Q-Learning for Content Placement in Cooperative Caching

Cited: 43
Authors
Yang, Zhong [1 ]
Liu, Yuanwei [1 ]
Chen, Yue [1 ]
Jiao, Lei [2 ]
Institutions
[1] Queen Mary Univ London, Sch Elect Engn & Comp Sci, London E1 4NS, England
[2] Univ Agder, Dept Informat & Commun Technol, N-4879 Grimstad, Norway
Keywords
Cooperative caching; Wireless communication; Prediction algorithms; Recurrent neural networks; Learning automata; Optimization; Quality of experience; Learning automata based Q-learning; quality of experience (QoE); wireless cooperative caching; user mobility prediction; content popularity prediction; NONORTHOGONAL MULTIPLE-ACCESS; WIRELESS; OPTIMIZATION; NETWORKS; DELIVERY;
DOI
10.1109/TCOMM.2020.2982136
CLC Classification Number
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Classification Code
0808 ; 0809 ;
Abstract
An optimization problem of content placement in cooperative caching is formulated, with the aim of maximizing the sum mean opinion score (MOS) of mobile users. First, since user mobility and content popularity have significant impacts on the user experience, a recurrent neural network (RNN) is invoked for user mobility prediction and content popularity prediction. In particular, practical data collected from a GPS-tracker app on smartphones is used to test the accuracy of the user mobility prediction. Then, based on the predicted mobile users' positions and content popularity, a learning automata based Q-learning (LAQL) algorithm for cooperative caching is proposed, in which learning automata (LA) are invoked within Q-learning to obtain optimal action selection in a random and stationary environment. It is proven that the LA based action selection scheme enables every state to select the optimal action with arbitrarily high probability, provided that Q-learning converges to the optimal Q value. In the LAQL algorithm, a central processor acts as the intelligent agent, which iteratively allocates contents to base stations (BSs) according to the reward or penalty fed back from the BSs and users. To characterize the performance of the proposed LAQL algorithm, the sum MOS of users is used to define the reward function. Extensive simulation results reveal that: 1) the prediction error of the RNN-based algorithm decreases as the number of iterations and nodes increases; 2) the proposed LAQL achieves significant performance improvement over the traditional Q-learning algorithm; and 3) the cooperative caching scheme outperforms non-cooperative caching and random caching by 3% and 4%, respectively.
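The core idea in the abstract — replacing epsilon-greedy exploration in Q-learning with a per-state learning automaton that maintains an action-probability vector and reinforces rewarded actions — can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the state/action sizes, step sizes, and the toy reward environment are all assumptions, and the automaton update shown is a simple linear reward-inaction scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 4, 3          # toy sizes (assumption)
Q = np.zeros((n_states, n_actions))
# Each state keeps an action-probability vector maintained by a
# learning automaton, used instead of epsilon-greedy exploration.
P = np.full((n_states, n_actions), 1.0 / n_actions)

alpha, gamma, lam = 0.1, 0.9, 0.05  # Q step, discount, LA step

def select_action(s):
    """Sample an action from the automaton's probability vector."""
    return rng.choice(n_actions, p=P[s])

def la_update(s, a, rewarded):
    """Linear reward-inaction: reinforce action a only when rewarded."""
    if rewarded:
        P[s] += lam * (np.eye(n_actions)[a] - P[s])
        P[s] /= P[s].sum()  # numerical guard; the update preserves the sum

def step(s, a):
    # Toy stationary environment (assumption): reward 1 when the action
    # matches the state index mod n_actions; states advance cyclically.
    r = 1.0 if a == s % n_actions else 0.0
    return (s + 1) % n_states, r

s = 0
for _ in range(5000):
    a = select_action(s)
    s2, r = step(s, a)
    # Standard Q-learning temporal-difference update.
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
    la_update(s, a, r > 0)
    s = s2
```

After training, each state's probability vector concentrates on its optimal action, illustrating the claimed property that the LA-based selection drives every state toward the optimal action with high probability as Q-learning converges.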
Pages: 3667 - 3680
Page count: 14
Related Papers
50 records
  • [1] Q-Learning for Content Placement in Wireless Cooperative Caching
    Yang, Zhong
    Liu, Yuanwei
    Chen, Yue
    2018 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2018,
  • [2] Cooperative Q-Learning Based on Learning Automata
    Yang, Mao
    Tian, Yantao
    Qi, Xinyue
2009 IEEE INTERNATIONAL CONFERENCE ON AUTOMATION AND LOGISTICS (ICAL 2009), VOLS 1-3, 2009, : 1972 - 1977
  • [3] Expertness based cooperative Q-learning
    Ahmadabadi, MN
    Asadpour, M
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2002, 32 (01): : 66 - 76
  • [4] Q-Learning Based Content Placement Method for Dynamic Cloud Content Delivery Networks
    Liu, Yujie
    Lu, Dianjie
    Zhang, Guijuan
    Tian, Jie
    Xu, Weizhi
    IEEE ACCESS, 2019, 7 : 66384 - 66394
  • [5] Cooperative Q-Learning Based on Maturity of the Policy
    Yang, Mao
    Tian, Yantao
    Liu, Xiaomei
    2009 IEEE INTERNATIONAL CONFERENCE ON MECHATRONICS AND AUTOMATION, VOLS 1-7, CONFERENCE PROCEEDINGS, 2009, : 1352 - 1356
  • [6] EFFECTS OF COMMUNICATION IN COOPERATIVE Q-LEARNING
    Darbyshire, Paul
    Wang, Dianhui
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2010, 6 (05): : 2113 - 2126
  • [7] Stochastic Game Based Cooperative Alternating Q-Learning Caching in Dynamic D2D Networks
    Zhang, Tiankui
    Fang, Xinyuan
    Wang, Ziduan
    Liu, Yuanwei
    Nallanathan, Arumugam
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2021, 70 (12) : 13255 - 13269
  • [8] Caching in Base Station with Recommendation via Q-Learning
    Guo, Kaiyang
    Yang, Chenyang
    Liu, Tingting
    2017 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC), 2017,
  • [9] Multi-criteria expertness based cooperative Q-learning
    Pakizeh, Esmat
    Palhang, Maziar
    Pedram, Mir Mohsen
    APPLIED INTELLIGENCE, 2013, 39 (01) : 28 - 40