Model-free Reinforcement Learning with Stochastic Reward Stabilization for Recommender Systems

Cited by: 0
Authors
Cai, Tianchi [1 ]
Bao, Shenliao [1 ]
Jiang, Jiyan [2 ]
Zhou, Shiji [2 ]
Zhang, Wenpeng [1 ]
Gu, Lihong [1 ]
Gu, Jinjie [1 ]
Zhang, Guannan [1 ]
Affiliations
[1] Ant Group, Hangzhou, People's Republic of China
[2] Tsinghua University, Beijing, People's Republic of China
Source
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023), 2023
Keywords
Recommender System; Reinforcement Learning;
DOI
10.1145/3539618.3592022
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Model-free RL-based recommender systems have recently received increasing research attention due to their capability to handle partial feedback and long-term rewards. However, most existing research has ignored a critical feature of recommender systems: a user's feedback on the same item at different times is random. This stochastic reward property differs essentially from the deterministic rewards of classic RL scenarios, which makes RL-based recommender systems much more challenging. In this paper, we first demonstrate in a simulator environment that using the direct stochastic feedback results in a significant drop in performance. Then, to handle the stochastic feedback more efficiently, we design two stochastic reward stabilization frameworks that replace the direct stochastic feedback with a reward estimate learned by a supervised model. Both frameworks are model-agnostic, i.e., they can effectively utilize various supervised models. We demonstrate the superiority of the proposed frameworks over different RL-based recommendation baselines with extensive experiments on a recommendation simulator as well as an industrial-level recommender system.
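To make the core idea concrete, the sketch below is a minimal, hypothetical Python illustration of the stabilization principle described in the abstract: fit a supervised reward model on logged feedback and feed its smoothed prediction, rather than the raw stochastic reward, into the RL update. The class and function names, the linear reward model, and the tabular Q-learning step are illustrative assumptions, not the authors' implementation.

import numpy as np

class SupervisedRewardModel:
    """Toy linear reward model fit on logged (feature, reward) pairs (hypothetical)."""

    def __init__(self, dim):
        self.w = np.zeros(dim)

    def fit(self, features, rewards, lr=0.01, epochs=100):
        # Simple gradient descent on squared error over the logged data.
        for _ in range(epochs):
            preds = features @ self.w
            grad = features.T @ (preds - rewards) / len(rewards)
            self.w -= lr * grad

    def predict(self, feature):
        return float(feature @ self.w)

def stabilized_q_update(q_table, s, a, s_next, reward_model, feature,
                        gamma=0.9, alpha=0.1):
    """One Q-learning step that uses the model-predicted reward in place of
    the raw stochastic reward observed from the user (the 'stabilized' variant)."""
    r_hat = reward_model.predict(feature)        # smoothed reward estimate
    td_target = r_hat + gamma * np.max(q_table[s_next])
    q_table[s, a] += alpha * (td_target - q_table[s, a])
    return q_table

In the paper's actual frameworks the supervised reward predictor is interchangeable, which is why the abstract describes them as model-agnostic; the linear model above is only one stand-in.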
Pages: 2179-2183
Page count: 5