Minimax Policy for Heavy-Tailed Bandits

Cited by: 3
Authors
Wei, Lai [1]
Srivastava, Vaibhav [1]
Affiliations
[1] Michigan State Univ, Dept Elect & Comp Engn, E Lansing, MI 48823 USA
Source
IEEE CONTROL SYSTEMS LETTERS | 2021, Vol. 5, No. 4
Keywords
Heavy-tailed distribution; stochastic MAB; worst-case regret; minimax policy
DOI
10.1109/LCSYS.2020.3035767
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We study the stochastic Multi-Armed Bandit (MAB) problem under worst-case regret with heavy-tailed reward distributions. We modify MOSS, the minimax policy for sub-Gaussian reward distributions, by using a saturated empirical mean to design a new algorithm called Robust MOSS. We show that if the moment of order 1 + epsilon of the reward distribution exists, then the refined strategy has a worst-case regret matching the lower bound while maintaining a distribution-dependent logarithmic regret.
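As a rough illustration of the idea named in the abstract, the sketch below clips each reward to a saturation limit before averaging, then adds a MOSS-style exploration bonus. It is not the paper's Robust MOSS: the function names, the fixed `limit`, and the sub-Gaussian form of the bonus are all assumptions made for illustration, and the paper's own saturation schedule and index are not reproduced here.

```python
import math


def saturate(x, limit):
    """Clip a single reward to the interval [-limit, limit]."""
    return max(-limit, min(limit, x))


def saturated_mean(rewards, limit):
    """Average of clipped rewards; large outliers contribute at most `limit`.

    Assumption: a fixed limit is used here purely for illustration.
    """
    if not rewards:
        raise ValueError("need at least one reward")
    return sum(saturate(r, limit) for r in rewards) / len(rewards)


def moss_index(mean, pulls, horizon, n_arms):
    """MOSS-style index: mean + sqrt(max(log(T / (K * n)), 0) / n).

    Assumption: this is the sub-Gaussian form of the bonus, not the
    heavy-tailed variant from the paper.
    """
    bonus = math.sqrt(max(math.log(horizon / (n_arms * pulls)), 0.0) / pulls)
    return mean + bonus


# One outlier (100) is clipped to 10 before averaging.
rewards = [1.0, 2.0, 100.0]
mean = saturated_mean(rewards, limit=10.0)
index = moss_index(mean, pulls=len(rewards), horizon=1000, n_arms=5)
```

With a plain empirical mean, the single outlier would pull the estimate to about 34.3; the saturated mean stays near 4.3, which is the kind of robustness a heavy-tailed setting calls for.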
Pages: 1423-1428
Number of pages: 6