Piecewise Stationary Bandits under Risk Criteria

Cited by: 0
Authors
Bhatt, Sujay [1]
Fang, Guanhua [2]
Li, Ping [3]
Affiliations
[1] JP Morgan AI Res, New York, NY 10017 USA
[2] Fudan Univ, Sch Management, Shanghai, Peoples R China
[3] LinkedIn Ads, Bellevue, WA 98004 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Piecewise stationary stochastic multi-armed bandits have been extensively explored in the risk-neutral and sub-Gaussian setting. In this work, we consider a multi-armed bandit framework in which the reward distributions are heavy-tailed and non-stationary, and evaluate the performance of algorithms using general risk criteria. Specifically, we make the following contributions: (i) We first propose a non-parametric change detection algorithm that can detect general distributional changes in heavy-tailed distributions. (ii) We then propose a truncation-based UCB-type bandit algorithm integrating the above regime change detection algorithm to minimize the regret of the non-stationary learning problem. (iii) Finally, we establish the regret bounds for the proposed bandit algorithm by characterizing the statistical properties of the general change detection algorithm, along with a novel regret analysis.
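To illustrate what "truncation-based UCB-type" can mean for heavy-tailed rewards, here is a minimal sketch of a truncated empirical mean and the index built from it. It is not the authors' algorithm: it targets the plain mean rather than a general risk criterion, it omits the change detection step, and it assumes a known bound `u` on the (1 + `eps`)-th absolute moment of the rewards. The function names, the parameters `u`, `eps`, `delta`, and the constant 4 in the confidence width are assumptions taken from the standard truncated-mean construction in the heavy-tailed bandit literature.

```python
import math


def truncated_mean(samples, u, eps, delta):
    """Truncated empirical mean for heavy-tailed samples (illustrative).

    Assumes E|X|^(1+eps) <= u. The t-th sample is kept only if its magnitude
    is at most (u * t / log(1/delta)) ** (1 / (1 + eps)); otherwise it
    contributes zero to the sum.
    """
    if not samples:
        raise ValueError("need at least one sample")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    log_term = math.log(1.0 / delta)
    total = 0.0
    for t, x in enumerate(samples, start=1):
        threshold = (u * t / log_term) ** (1.0 / (1.0 + eps))
        if abs(x) <= threshold:
            total += x
    return total / len(samples)


def ucb_index(samples, u, eps, delta):
    """Truncated mean plus a confidence width (hypothetical constant 4)."""
    n = len(samples)
    width = (
        4.0
        * u ** (1.0 / (1.0 + eps))
        * (math.log(1.0 / delta) / n) ** (eps / (1.0 + eps))
    )
    return truncated_mean(samples, u, eps, delta) + width
```

In a piecewise stationary setting, a wrapper would typically discard an arm's `samples` whenever a change detector fires, so that the index is computed only from rewards observed in the current regime.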
Pages: 23