DEN-DQL: Quick Convergent Deep Q-Learning with Double Exploration Networks for News Recommendation

Cited by: 0
Authors
Song, Zhanghan [1 ]
Zhang, Dian [1 ]
Shi, Xiaochuan [1 ]
Li, Wei [2 ]
Ma, Chao [1 ]
Wu, Libing [1 ]
Affiliations
[1] Wuhan Univ, Sch Cyber Sci & Engn, Wuhan, Peoples R China
[2] Jiangnan Univ, Sch Artificial Intelligence & Comp Sci, Wuxi, Jiangsu, Peoples R China
Source
2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2021
Keywords
Reinforcement Learning; Deep Q-Learning; News Recommendation; Double Exploration Networks;
DOI
10.1109/IJCNN52387.2021.9533818
CLC Number
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Due to the dynamic nature of news content and user preferences, personalized news recommendation is a challenging problem. Traditional recommendation methods focus only on the current reward, recommending items so as to maximize the number of immediate clicks, which may reduce users' interest in similar items. Although the previously proposed news recommendation framework based on deep reinforcement learning (i.e., DRL, built on deep Q-learning) has the advantages of optimizing the total future reward and of dynamic interactive recommendation, it has two issues. First, its exploration method converges slowly, which may give new users a poor experience. Second, it is hard to train on an offline data set because the reward is difficult to determine. To address these issues, we propose DEN-DQL, a news recommendation framework based on deep Q-learning with double exploration networks. We also develop a new method to calculate rewards, and we use an offline data set to simulate the online news-clicking environment to train DEN-DQL. The well-trained DEN-DQL is then tested in the online environment of the same data set, where the proposed DEN-DQL demonstrates an improvement of at least 10%.
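The abstract leaves the mechanics of the double exploration networks and the reward calculation unspecified. Purely as an illustration, the sketch below shows one plausible reading in Python/PyTorch, assuming a DRN-style perturbation scheme in which two noisy copies of the Q-network contribute exploratory scores alongside the main network; every class name, hyperparameter, and the score-averaging rule is an assumption for illustration, not the paper's actual design.

    # Minimal sketch: deep Q-learning with two exploration networks.
    # All names and hyperparameters below are illustrative assumptions,
    # not the architecture or reward scheme used in DEN-DQL.
    import copy

    import torch
    import torch.nn as nn


    class QNet(nn.Module):
        """Small MLP that scores a (state, candidate-news) feature vector."""

        def __init__(self, dim: int, hidden: int = 64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.net(x).squeeze(-1)


    def perturbed(q: QNet, scale: float = 0.05) -> QNet:
        """Exploration network: a copy of q with noise added to each weight."""
        expl = copy.deepcopy(q)
        with torch.no_grad():
            for p in expl.parameters():
                p.add_(scale * torch.randn_like(p))
        return expl


    def select_action(q: QNet, candidates: torch.Tensor) -> int:
        """Recommend the candidate with the highest blended score.

        Averaging the main network with two freshly perturbed copies is an
        assumed stand-in for the paper's double-exploration scheme.
        """
        with torch.no_grad():
            scores = (q(candidates)
                      + perturbed(q)(candidates)
                      + perturbed(q)(candidates)) / 3.0
        return int(scores.argmax().item())


    def td_update(q, target, opt, s, r, s_next, gamma=0.9):
        """One standard deep Q-learning step on a single transition."""
        with torch.no_grad():
            y = r + gamma * target(s_next).max()   # bootstrap from target net
        loss = (q(s) - y).pow(2)
        opt.zero_grad()
        loss.backward()
        opt.step()
        return loss.item()


    if __name__ == "__main__":
        dim = 16                                   # toy feature size
        q, target = QNet(dim), QNet(dim)
        target.load_state_dict(q.state_dict())
        opt = torch.optim.Adam(q.parameters(), lr=1e-3)
        candidates = torch.randn(10, dim)          # 10 candidate news items
        a = select_action(q, candidates)
        reward = torch.tensor(1.0)                 # e.g. an observed click
        next_candidates = torch.randn(10, dim)
        td_update(q, target, opt, candidates[a], reward, next_candidates)

Regenerating the perturbed copies at each selection step keeps exploration cheap relative to maintaining separately trained networks; whether DEN-DQL shares weights this way is not stated in the abstract.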
Pages: 8
Related Papers
50 records in total
  • [21] Double-deep Q-learning to increase the efficiency of metasurface holograms
    Sajedian, Iman
    Lee, Heon
    Rho, Junsuk
    SCIENTIFIC REPORTS, 2019, 9 (1)
  • [22] Erosion-safe operation using double deep Q-learning
    Visbech, Jens
    Gocmen, Tuhfe
    Rethore, Pierre-Elouan
    Hasager, Charlotte Bay
    SCIENCE OF MAKING TORQUE FROM WIND, TORQUE 2024, 2024, 2767
  • [23] Double-deep Q-learning to increase the efficiency of metasurface holograms
    Iman Sajedian
    Heon Lee
    Junsuk Rho
    Scientific Reports, 9
  • [24] Two timescale convergent Q-learning for sleep-scheduling in wireless sensor networks
    L. A. Prashanth
    Abhranil Chatterjee
    Shalabh Bhatnagar
    Wireless Networks, 2014, 20 : 2589 - 2604
  • [25] Two timescale convergent Q-learning for sleep-scheduling in wireless sensor networks
    Prashanth, L. A.
    Chatterjee, Abhranil
    Bhatnagar, Shalabh
    WIRELESS NETWORKS, 2014, 20 (08) : 2589 - 2604
  • [26] Improved Exploration Strategy for Q-Learning Based Multipath Routing in SDN Networks
    Hassen, Houda
    Meherzi, Soumaya
    Jemaa, Zouhair Ben
    JOURNAL OF NETWORK AND SYSTEMS MANAGEMENT, 2024, 32 (02)
  • [27] Improved Exploration Strategy for Q-Learning Based Multipath Routing in SDN Networks
    Houda Hassen
    Soumaya Meherzi
    Zouhair Ben Jemaa
    Journal of Network and Systems Management, 2024, 32
  • [28] Multi-Agent Exploration for Faster and Reliable Deep Q-Learning Convergence in Reinforcement Learning
    Majumdar, Abhijit
    Benavidez, Patrick
    Jamshidi, Mo
    2018 WORLD AUTOMATION CONGRESS (WAC), 2018, : 222 - 227
  • [29] A Double Deep Q-Learning Model for Energy-Efficient Edge Scheduling
    Zhang, Qingchen
    Lin, Man
    Yang, Laurence T.
    Chen, Zhikui
    Khan, Samee U.
    Li, Peng
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2019, 12 (05) : 739 - 749
  • [30] Inverse design of the MMI power splitter by asynchronous double deep Q-learning
    Xu, Xiaopeng
    Li, Yu
    Huang, Weiping
    OPTICS EXPRESS, 2021, 29 (22) : 35951 - 35964