Linear Thompson Sampling Revisited

Cited by: 0
Authors
Abeille, Marc [1]
Lazaric, Alessandro [1]
Affiliations
[1] Inria Lille Nord Europe, Team SequeL, Villeneuve d'Ascq, France
Source
ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 54 | 2017 / Vol. 54
Keywords
BANDIT; REGRET
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We derive an alternative proof for the regret of Thompson sampling (TS) in the stochastic linear bandit setting. While we obtain a regret bound of order $O(d^{3/2}\sqrt{T})$ as in previous results, the proof sheds new light on the functioning of TS. We leverage the structure of the problem to show how the regret is related to the sensitivity (i.e., the gradient) of the objective function and how selecting optimal arms associated with optimistic parameters does control it. Thus we show that TS can be seen as a generic randomized algorithm where the sampling distribution is designed to have a fixed probability of being optimistic, at the cost of an additional $\sqrt{d}$ regret factor compared to a UCB-like approach. Furthermore, we show that our proof can be readily applied to regularized linear optimization and generalized linear model problems.
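As an illustration of the algorithm the abstract refers to, the following is a minimal sketch of Thompson sampling on a linear bandit with a finite arm set. It is not the authors' implementation: the function name `linear_thompson_sampling`, the Gaussian perturbation, the arm set, and the constant `scale` (which stands in for the paper's confidence radius and only mimics the extra $\sqrt{d}$ inflation) are all assumptions made for this example.

```python
import numpy as np

def linear_thompson_sampling(arms, theta_star, horizon, noise_std=0.1, reg=1.0, seed=0):
    """Run a simplified linear TS on a fixed arm set; return cumulative regret per round."""
    rng = np.random.default_rng(seed)
    d = arms.shape[1]
    V = reg * np.eye(d)              # regularized design matrix
    b = np.zeros(d)                  # running sum of reward * feature
    best_reward = np.max(arms @ theta_star)
    regret = np.zeros(horizon)
    total = 0.0
    # Assumed perturbation scale: inflated by sqrt(d) for illustration only.
    scale = noise_std * np.sqrt(d)
    for t in range(horizon):
        V_inv = np.linalg.inv(V)
        theta_hat = V_inv @ b        # regularized least-squares estimate
        # L @ eta has covariance V^{-1} when eta ~ N(0, I).
        L = np.linalg.cholesky(V_inv)
        theta_tilde = theta_hat + scale * (L @ rng.standard_normal(d))
        # Play the arm that is optimal for the sampled parameter.
        x = arms[np.argmax(arms @ theta_tilde)]
        reward = x @ theta_star + noise_std * rng.standard_normal()
        V += np.outer(x, x)
        b += reward * x
        total += best_reward - x @ theta_star
        regret[t] = total
    return regret

# Hypothetical problem instance: 20 random arms in dimension 5.
rng = np.random.default_rng(1)
arms = rng.standard_normal((20, 5))
theta_star = rng.standard_normal(5)
regret = linear_thompson_sampling(arms, theta_star, horizon=500)
print(regret[-1])
```

The sampled parameter `theta_tilde` is "optimistic" in a given round when its best achievable reward exceeds that of the true parameter; widening the perturbation makes this more likely, which is the trade-off against the additional regret factor described in the abstract.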
Pages: 176-184
Number of pages: 9
Related papers
50 in total
  • [1] Linear Thompson sampling revisited
    Abeille, Marc
    Lazaric, Alessandro
    ELECTRONIC JOURNAL OF STATISTICS, 2017, 11 (02): : 5165 - 5197
  • [2] LINEAR THOMPSON SAMPLING UNDER UNKNOWN LINEAR CONSTRAINTS
    Moradipari, Ahmadreza
    Alizadeh, Mahnoosh
    Thrampoulidis, Christos
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3392 - 3396
  • [3] THE LINEAR SAMPLING METHOD REVISITED
    Arens, Tilo
    Lechleiter, Armin
    JOURNAL OF INTEGRAL EQUATIONS AND APPLICATIONS, 2009, 21 (02) : 179 - 202
  • [4] Safe Linear Thompson Sampling With Side Information
    Moradipari, Ahmadreza
    Amani, Sanae
    Alizadeh, Mahnoosh
    Thrampoulidis, Christos
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2021, 69 : 3755 - 3767
  • [5] Doubly Robust Thompson Sampling with Linear Payoffs
    Kim, Wonyoung
    Kim, Gi-Soo
    Paik, Myunghee Cho
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [6] Control of Unknown Linear Systems with Thompson Sampling
    Ouyang, Yi
    Gagrani, Mukul
    Jain, Rahul
    2017 55TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2017, : 1198 - 1205
  • [7] Thompson Sampling for Linear-Quadratic Control Problems
    Abeille, Marc
    Lazaric, Alessandro
    ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 54, 2017, 54 : 1246 - 1254
  • [8] Gain estimation of linear dynamical systems using Thompson Sampling
    Mueller, Matias, I
    Rojas, Cristian R.
    22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89, 2019, 89
  • [9] Thompson Sampling for Partially Observable Linear-Quadratic Control
    Kargin, Taylan
    Lale, Sahin
    Azizzadenesheli, Kamyar
    Anandkumar, Anima
    Hassibi, Babak
    2023 AMERICAN CONTROL CONFERENCE, ACC, 2023, : 4561 - 4568
  • [10] Noise-Adaptive Thompson Sampling for Linear Contextual Bandits
    Xu, Ruitu
    Min, Yifei
    Wang, Tianhao
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,