Handling Concept Drift in Non-stationary Bandit Through Predicting Future Rewards

Authors
Tsai, Yun-Da [1]
Lin, Shou-De [1]
Affiliation
[1] Natl Taiwan Univ, Taipei, Taiwan
DOI
10.1007/978-981-97-2650-9_13
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a study on the non-stationary stochastic multi-armed bandit (MAB) problem, which is relevant for addressing real-world challenges related to sequential decision-making. Our work involves a thorough analysis of state-of-the-art algorithms in dynamically changing environments. To address the limitations of existing methods, we propose the Concept Drift Adaptive Bandit (CDAB) framework, which aims to capture and predict potential future concept drift patterns in reward distribution, allowing for better adaptation in non-stationary environments. We conduct extensive numerical experiments to evaluate the effectiveness of the CDAB approach in comparison to both stationary and non-stationary state-of-the-art baselines. Our experiments involve testing on both artificial datasets and real-world data under different types of changing environments. The results show that the CDAB approach exhibits strong empirical performance, outperforming existing methods in all versions tested.
Pages: 161-173
Number of pages: 13