Piecewise Stationary Bandits under Risk Criteria

Cited by: 0
Authors
Bhatt, Sujay [1]
Fang, Guanhua [2]
Li, Ping [3]
Affiliations
[1] JP Morgan AI Research, New York, NY 10017, USA
[2] Fudan University, School of Management, Shanghai, People's Republic of China
[3] LinkedIn Ads, Bellevue, WA 98004, USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Piecewise stationary stochastic multi-armed bandits have been extensively explored in the risk-neutral and sub-Gaussian setting. In this work, we consider a multi-armed bandit framework in which the reward distributions are heavy-tailed and non-stationary, and evaluate the performance of algorithms using general risk criteria. Specifically, we make the following contributions: (i) We first propose a non-parametric change detection algorithm that can detect general distributional changes in heavy-tailed distributions. (ii) We then propose a truncation-based UCB-type bandit algorithm integrating the above regime change detection algorithm to minimize the regret of the non-stationary learning problem. (iii) Finally, we establish the regret bounds for the proposed bandit algorithm by characterizing the statistical properties of the general change detection algorithm, along with a novel regret analysis.
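As a rough illustration of what "truncation-based UCB-type" can mean, the sketch below implements the standard truncated empirical mean from the heavy-tailed bandit literature, together with its usual confidence bonus. It is not the algorithm proposed in this paper: it targets the risk-neutral mean rather than a general risk criterion, and it has no change detection. The function names and the parameters `u` (an assumed bound on the (1 + eps)-th absolute moment of the rewards), `eps`, and `delta` are illustrative assumptions.

```python
import math


def truncated_mean(rewards, u, eps, delta):
    """Truncated empirical mean for heavy-tailed rewards.

    Assumes E|X|^(1+eps) <= u for some eps in (0, 1] and delta in (0, 1).
    The t-th sample is kept only if |x_t| <= (u * t / log(1/delta))^(1/(1+eps));
    larger samples contribute zero to the sum.
    """
    if not rewards:
        raise ValueError("need at least one reward")
    log_term = math.log(1.0 / delta)
    total = 0.0
    for t, x in enumerate(rewards, start=1):
        threshold = (u * t / log_term) ** (1.0 / (1.0 + eps))
        if abs(x) <= threshold:
            total += x
    # Divide by the full sample count, not the number of kept samples.
    return total / len(rewards)


def ucb_index(rewards, u, eps, delta):
    """Optimistic index: truncated mean plus a heavy-tailed confidence bonus."""
    n = len(rewards)
    bonus = (
        4.0
        * u ** (1.0 / (1.0 + eps))
        * (math.log(1.0 / delta) / n) ** (eps / (1.0 + eps))
    )
    return truncated_mean(rewards, u, eps, delta) + bonus
```

In a piecewise stationary setting, one plausible use is to recompute these indices only from rewards observed since the most recent detected change, which is the role a change detector would play; how the paper combines the two is not specified in the abstract.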
Pages: 23