Minimax Policy for Heavy-Tailed Bandits

Cited by: 3
Authors
Wei, Lai [1]
Srivastava, Vaibhav [1]
Affiliations
[1] Michigan State Univ, Dept Elect & Comp Engn, E Lansing, MI 48823 USA
Source
IEEE CONTROL SYSTEMS LETTERS | 2021 / Vol. 5 / Issue 04
Keywords
Heavy-tailed distribution; stochastic MAB; worst-case regret; minimax policy
DOI
10.1109/LCSYS.2020.3035767
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
We study the stochastic Multi-Armed Bandit (MAB) problem under worst-case regret and heavy-tailed reward distributions. We modify MOSS, the minimax policy for sub-Gaussian reward distributions, by using a saturated empirical mean to design a new algorithm called Robust MOSS. We show that if the moment of order 1 + epsilon of the reward distribution exists, then the refined strategy has a worst-case regret matching the lower bound while maintaining a distribution-dependent logarithmic regret.
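As an illustration of the idea named in the abstract, the following is a minimal Python sketch of a saturated (clipped) empirical mean combined with a MOSS-style index. It is not the paper's Robust MOSS: the function names, the fixed saturation `limit`, and the exploration bonus (the standard sub-Gaussian MOSS form) are assumptions made for this example, and the abstract does not specify the saturation schedule or bonus that the authors adapt to the moment of order 1 + epsilon.

```python
import math


def saturate(x, limit):
    """Clip a single reward to the interval [-limit, limit]."""
    return max(-limit, min(limit, x))


def saturated_mean(rewards, limit):
    """Average of clipped rewards; large outliers contribute at most `limit`."""
    if not rewards:
        raise ValueError("need at least one reward")
    return sum(saturate(r, limit) for r in rewards) / len(rewards)


def moss_style_index(rewards, horizon, n_arms, limit):
    """Saturated mean plus a MOSS-style exploration bonus.

    Assumption: the bonus sqrt(max(log(T / (K * n)), 0) / n) is the
    standard sub-Gaussian MOSS form, used here only for illustration.
    """
    n = len(rewards)
    bonus = math.sqrt(max(math.log(horizon / (n_arms * n)), 0.0) / n)
    return saturated_mean(rewards, limit) + bonus
```

With a hypothetical limit of 5, the sample `[1.0, 2.0, 100.0]` is averaged as if the outlier were 5, so a single extreme reward cannot dominate the estimate for an arm.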
Pages: 1423 - 1428
Number of pages: 6
Related papers
50 in total
  • [21] Renewal reward processes with heavy-tailed inter-renewal times and heavy-tailed rewards
    Levy, JB
    Taqqu, MS
    BERNOULLI, 2000, 6 (01) : 23 - 44
  • [22] Heavy-tailed log hydraulic conductivity distributions imply heavy-tailed log velocity distributions
    Kohlbecker, MV
    Wheatcraft, SW
    Meerschaert, MM
    WATER RESOURCES RESEARCH, 2006, 42 (04)
  • [23] On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
    Bedi, Amrit Singh
    Parayil, Anjaly
    Zhang, Junyu
    Wang, Mengdi
    Koppel, Alec
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25
  • [24] Minimax Optimal Bandits for Heavy Tail Rewards
    Lee, Kyungjae
    Lim, Sungbin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 35 (04) : 4899 - 4901
  • [25] Heavy-tailed Independent Component Analysis
    Anderson, Joseph
    Goyal, Navin
    Nandi, Anupama
    Rademacher, Luis
    2015 IEEE 56TH ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE, 2015, : 290 - 309
  • [26] A PARAMETRIC BOOTSTRAP FOR HEAVY-TAILED DISTRIBUTIONS
    Cornea-Madeira, Adriana
    Davidson, Russell
    ECONOMETRIC THEORY, 2015, 31 (03) : 449 - 470
  • [27] Estimating the Mean of Heavy-Tailed Distributions
    Joachim Johansson
    Extremes, 2003, 6 (2) : 91 - 109
  • [28] Search heuristics and heavy-tailed behaviour
    Hulubei, T
    O'Sullivan, B
    PRINCIPLES AND PRACTICE OF CONSTRAINT PROGRAMMING - CP 2005, PROCEEDINGS, 2005, 3709 : 328 - 342
  • [29] Asymptotic Expansions for Heavy-Tailed Data
    Pastor, Giancarlo
    Mora-Jimenez, Inmaculada
    Caamano, Antonio J.
    Jantti, Riku
    IEEE SIGNAL PROCESSING LETTERS, 2016, 23 (04) : 444 - 448
  • [30] HEAVY-TAILED BAYESIAN NONPARAMETRIC ADAPTATION
    Agapiou, Sergios
    Castillo, Ismael
    ANNALS OF STATISTICS, 2024, 52 (04) : 1433 - 1459