Fast Asymptotically Optimal Algorithms for Non-Parametric Stochastic Bandits

Cited by: 0
Authors
Baudry, Dorian [1]
Pesquerel, Fabien [2]
Degenne, Remy [2]
Maillard, Odalric-Ambrym [2]
Affiliations
[1] Ecole Polytech, CREST, Palaiseau, France
[2] Univ Lille, Cent Lille, CNRS, Inria, UMR 9189, CRIStAL, F-59000 Lille, France
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
We consider the problem of regret minimization in non-parametric stochastic bandits. When the rewards are known to be bounded from above, there exist asymptotically optimal algorithms whose asymptotic regret depends on an infimum of Kullback-Leibler (KL) divergences. These algorithms are computationally expensive and require storing all past rewards, so simpler but non-optimal algorithms are often used instead. We introduce several methods to approximate the infimum KL that drastically reduce the computational and memory costs of existing optimal algorithms while keeping their regret guarantees. We apply our findings to design new variants of the MED and IMED algorithms, and demonstrate their interest with extensive numerical simulations.
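For orientation, the infimum KL named in the abstract can be illustrated on an empirical reward distribution. The sketch below is not one of the paper's approximation methods; it evaluates the standard dual form for rewards bounded above by B, K_inf(F, mu) = max over lambda in [0, 1/(B - mu)] of E_F[log(1 - lambda (X - mu))], and the usual IMED index N * K_inf + log N. The function names `kinf_bounded` and `imed_index`, the default bound `upper=1.0`, and the ternary-search solver are assumptions made for illustration.

```python
import math


def kinf_bounded(rewards, mu, upper=1.0, iters=200):
    """Approximate K_inf(F_hat, mu) for the empirical distribution of `rewards`.

    Assumes every reward is <= upper. Uses the dual form
    max_{0 <= lam <= 1/(upper - mu)} mean(log(1 - lam * (x - mu))),
    maximised by ternary search (the objective is concave in lam).
    """
    n = len(rewards)
    mean = sum(rewards) / n
    if mu <= mean:
        return 0.0  # the empirical distribution already has mean >= mu
    if mu >= upper:
        return math.inf  # treated as infinite in this sketch

    def objective(lam):
        total = 0.0
        for x in rewards:
            arg = 1.0 - lam * (x - mu)
            if arg <= 0.0:
                return -math.inf  # outside the domain of the logarithm
            total += math.log(arg)
        return total / n

    lo, hi = 0.0, 1.0 / (upper - mu)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return max(0.0, objective(0.5 * (lo + hi)))


def imed_index(rewards, best_mean, upper=1.0):
    """IMED-style index N * K_inf(F_hat, best_mean) + log N (smaller is pulled first)."""
    n = len(rewards)
    return n * kinf_bounded(rewards, best_mean, upper) + math.log(n)
```

Each call scans the full reward history of an arm and solves a one-dimensional optimisation, which is the kind of computational and memory cost the abstract refers to. For rewards in {0, 1} with `upper=1.0`, the value coincides with the Bernoulli KL divergence between the empirical mean and `mu`.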
Pages: 46