Piecewise Stationary Bandits under Risk Criteria

Cited by: 0

Authors:

Bhatt, Sujay [1]

Fang, Guanhua [2]

Li, Ping [3]

Affiliations:

[1] JP Morgan AI Res, New York, NY 10017 USA

[2] Fudan Univ, Sch Management, Shanghai, Peoples R China

[3] LinkedIn Ads, Bellevue, WA 98004 USA

Source:

INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 206 | 2023 / Volume 206

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Piecewise stationary stochastic multi-armed bandits have been extensively explored in the risk-neutral and sub-Gaussian setting. In this work, we consider a multi-armed bandit framework in which the reward distributions are heavy-tailed and non-stationary, and evaluate the performance of algorithms using general risk criteria. Specifically, we make the following contributions: (i) We first propose a non-parametric change detection algorithm that can detect general distributional changes in heavy-tailed distributions. (ii) We then propose a truncation-based UCB-type bandit algorithm integrating the above regime change detection algorithm to minimize the regret of the non-stationary learning problem. (iii) Finally, we establish the regret bounds for the proposed bandit algorithm by characterizing the statistical properties of the general change detection algorithm, along with a novel regret analysis.


Pages: 23

50 items in total

[1] Near-Optimal MNL Bandits Under Risk Criteria
Xi, Guangyu
Tao, Chao
Zhou, Yuan
THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 10397 - 10404
[2] NEAR-OPTIMAL ALGORITHMS FOR PIECEWISE-STATIONARY CASCADING BANDITS
Wang, Lingda
Zhou, Huozhi
Li, Bingcong
Varshney, Lav R.
Zhao, Zhizhen
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 3365 - 3369
[3] Roving bandits and stationary bandits
Lee, S
FORBES, 1998, 161 (09): : 149 - +
[4] Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits
Besson, Lilian
Kaufmann, Emilie
Maillard, Odalric-Ambrym
Seznec, Julien
Journal of Machine Learning Research, 2022, 23
[5] Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits
Besson, Lilian
Kaufmann, Emilie
Maillard, Odalric-Ambrym
Seznec, Julien
JOURNAL OF MACHINE LEARNING RESEARCH, 2022, 23
[6] Approximately Stationary Bandits with Knapsacks
Fikioris, Giannis
Tardos, Eva
THIRTY SIXTH ANNUAL CONFERENCE ON LEARNING THEORY, VOL 195, 2023, 195
[7] A Near-Optimal Change-Detection Based Algorithm for Piecewise-Stationary Combinatorial Semi-Bandits
Zhou, Huozhi
Wang, Lingda
Varshney, Lav R.
Lim, Ee-Peng
THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 6933 - 6940
[8] Non-stationary Bandits with Knapsacks
Liu, Shang
Jiang, Jiashuo
Li, Xiaocheng
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
[9] Non-Stationary Bandits under Recharging Payoffs: Improved Planning with Sublinear Regret
Papadigenopoulos, Orestis
Caramanis, Constantine
Shakkottai, Sanjay
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
[10] Non-stationary Bandits with Heavy Tail
Pan, Weici
Liu, Zhenhua
Performance Evaluation Review, 2024, 52 (02): : 33 - 35
