Budgeted Recommendation with Delayed Feedback

Authors
Liu, Kweiguu [1]
Maghsudi, Setareh [2]
Yokoo, Makoto [1]
Affiliations
[1] Kyushu Univ, Fac Informat Sci & Elect Engn, Fukuoka 8190395, Japan
[2] Ruhr Univ Bochum, Fac Elect Engn & Informat Technol, D-44801 Bochum, Germany
Source
GOOD PRACTICES AND NEW PERSPECTIVES IN INFORMATION SYSTEMS AND TECHNOLOGIES, VOL 3, WORLDCIST 2024 | 2024, Vol. 987
Keywords
Budget Constraints; Delayed Feedback; Online Learning; Resource Allocation
DOI
10.1007/978-3-031-60221-4_20
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In a conventional contextual multi-armed bandit problem, the feedback (or reward) is immediately observable after an action. Nevertheless, delayed feedback arises in numerous real-life situations and is particularly crucial in time-sensitive applications. The exploration-exploitation dilemma becomes especially challenging under such conditions, as it is coupled with the interplay between delays and limited resources. Moreover, a limited budget often aggravates the problem by restricting the exploration potential. A motivating example is the distribution of medical supplies at the early stage of COVID-19: delayed testing results left insufficient information for learning, which degraded the efficiency of resource allocation. Motivated by such applications, we study the effect of delayed feedback on constrained contextual bandits. We develop a decision-making policy, delay-oriented resource allocation with learning (DORAL), to optimize the resource expenditure in a contextual multi-armed bandit problem with arm-dependent delayed feedback.
Pages: 202-213
Page count: 12