No-Regret Linear Bandits beyond Realizability

Cited by: 0
Authors
Liu, Chong [1]
Yin, Ming [1]
Wang, Yu-Xiang [1]
Affiliations
[1] Univ Calif Santa Barbara, Dept Comp Sci, Santa Barbara, CA 93106 USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We study linear bandits when the underlying reward function is not linear. Existing work relies on a uniform misspecification parameter epsilon that measures the sup-norm error of the best linear approximation, which results in unavoidable linear regret whenever epsilon > 0. We describe a more natural model of misspecification that only requires the approximation error at each input x to be proportional to the suboptimality gap at x. It captures the intuition that, for optimization problems, near-optimal regions should matter more, and that we can tolerate larger approximation errors in suboptimal regions. Quite surprisingly, we show that the classical Lin-UCB algorithm, designed for the realizable case, is automatically robust against such gap-adjusted misspecification. It achieves a near-optimal √T regret on problems for which the best-known regret is almost linear in the time horizon T. Technically, our proof relies on a novel self-bounding argument that bounds the part of the regret due to misspecification by the regret itself.
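To make the contrast in the abstract concrete, the sketch below compares the uniform (sup-norm) error with a gap-adjusted ratio on a small toy problem, and then runs a textbook LinUCB loop on it. Everything here is an illustrative assumption rather than the authors' code: the reading of "proportional to the suboptimality gap" as |f(x) - <w, x>| <= rho * (max f - f(x)), the symbol rho, the function names, the toy reward f, the five-point action set, the noise-free feedback, and the constants beta = 1 and lam = 1 (in particular, beta is not the confidence radius used in the paper's analysis).

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def uniform_error(actions, f, w):
    """Sup-norm error of the linear model w over the action set (the 'epsilon' view)."""
    return max(abs(f(x) - dot(w, x)) for x in actions)

def gap_adjusted_ratio(actions, f, w):
    """Smallest rho with |f(x) - <w, x>| <= rho * (max f - f(x)) for every action.

    Returns math.inf when the linear model is wrong at a maximizer of f.
    """
    f_max = max(f(x) for x in actions)
    rho = 0.0
    for x in actions:
        err, gap = abs(f(x) - dot(w, x)), f_max - f(x)
        if gap > 0.0:
            rho = max(rho, err / gap)
        elif err > 0.0:
            return math.inf
    return rho

def lin_ucb(actions, reward, horizon, beta=1.0, lam=1.0):
    """Textbook LinUCB over a finite action set; returns the list of chosen actions."""
    d = len(actions[0])
    # v_inv tracks (lam * I + sum of x x^T)^{-1}; b tracks sum of reward * x.
    v_inv = [[(1.0 / lam if i == j else 0.0) for j in range(d)] for i in range(d)]
    b = [0.0] * d
    picks = []
    for _ in range(horizon):
        theta = [dot(row, b) for row in v_inv]  # ridge-regression estimate

        def ucb(x):
            width_sq = dot(x, [dot(row, x) for row in v_inv])
            return dot(theta, x) + beta * math.sqrt(max(width_sq, 0.0))

        x = max(actions, key=ucb)  # optimistic action choice
        y = reward(x)
        # Sherman-Morrison rank-one update of the inverse design matrix.
        vx = [dot(row, x) for row in v_inv]
        denom = 1.0 + dot(x, vx)
        v_inv = [[v_inv[i][j] - vx[i] * vx[j] / denom for j in range(d)]
                 for i in range(d)]
        b = [bi + y * xi for bi, xi in zip(b, x)]
        picks.append(x)
    return picks

# Hypothetical toy problem: features x = (t, 1), linear model w = (1, 0),
# true reward f(x) = t - 0.5 * (1 - t)^2, which is not linear in x.
ACTIONS = [(t / 4.0, 1.0) for t in range(5)]
W = (1.0, 0.0)

def f(x):
    return x[0] - 0.5 * (1.0 - x[0]) ** 2

def cumulative_regret(picks):
    f_max = max(f(x) for x in ACTIONS)
    return sum(f_max - f(x) for x in picks)

print("uniform error:", uniform_error(ACTIONS, f, W))
print("gap-adjusted ratio:", gap_adjusted_ratio(ACTIONS, f, W))
print("regret after 500 noise-free rounds:",
      cumulative_regret(lin_ucb(ACTIONS, f, 500)))
```

On this toy set the linear model is exact at the best action and worst at the poorest one, so the uniform error is 0.5 while the gap-adjusted ratio is only 1/3. The simulation is a sanity check on one hand-built instance and says nothing about the regret rate claimed in the abstract.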
Pages: 1294-1303
Number of pages: 10