No-Regret Linear Bandits beyond Realizability

Cited by: 0

Authors:

Liu, Chong [1]

Yin, Ming [1]

Wang, Yu-Xiang [1]

Affiliations:

[1] Univ Calif Santa Barbara, Dept Comp Sci, Santa Barbara, CA 93106 USA

Source:

UNCERTAINTY IN ARTIFICIAL INTELLIGENCE | 2023 / Vol. 216

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We study linear bandits when the underlying reward function is not linear. Existing work relies on a uniform misspecification parameter epsilon that measures the sup-norm error of the best linear approximation. This results in unavoidable linear regret whenever epsilon > 0. We describe a more natural model of misspecification, which only requires the approximation error at each input x to be proportional to the suboptimality gap at x. It captures the intuition that, for optimization problems, near-optimal regions should matter more, and that we can tolerate larger approximation errors in suboptimal regions. Quite surprisingly, we show that the classical Lin-UCB algorithm, designed for the realizable case, is automatically robust against such gap-adjusted misspecification. It achieves a near-optimal sqrt(T) regret on problems for which the best-known regret is almost linear in the time horizon T. Technically, our proof relies on a novel self-bounding argument that bounds the part of the regret due to misspecification by the regret itself.
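The setting in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: it assumes a finite set of unit-norm arms, builds a nonlinear reward f whose deviation from a linear model w·x is at most rho times the suboptimality gap f* - f(x), and runs a LinUCB-style rule on it. The function names, the sin-squared perturbation, and the constant confidence width `beta` are all illustrative assumptions; the paper's own confidence radius is not reproduced here.

```python
import numpy as np


def make_gap_adjusted_problem(num_arms=50, dim=5, rho=0.3, seed=0):
    """Build a toy reward with |f(x) - w.x| <= rho * (f* - f(x)) for every arm.

    Hypothetical construction for illustration only.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(num_arms, dim))
    X /= np.linalg.norm(X, axis=1, keepdims=True)  # unit-norm arms
    w = rng.normal(size=dim)
    w /= np.linalg.norm(w)

    linear = X @ w
    linear_gap = linear.max() - linear            # zero at the linear optimum
    shape = np.sin(5.0 * X[:, 0]) ** 2            # arbitrary values in [0, 1]
    # Push rewards down by at most rho * linear_gap. The best arm is left
    # untouched, and the true gap is at least the linear gap, so the
    # gap-adjusted condition holds with parameter rho.
    f = linear - rho * linear_gap * shape
    return X, w, f


def lin_ucb(X, f, horizon=2000, noise_std=0.1, lam=1.0, beta=1.0, seed=1):
    """LinUCB-style rule with ridge regression and a constant width (assumed)."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1]
    V = lam * np.eye(dim)        # regularized design matrix
    b = np.zeros(dim)            # running sum of reward-weighted features
    f_star = f.max()
    regret = np.zeros(horizon)

    for t in range(horizon):
        V_inv = np.linalg.inv(V)
        theta_hat = V_inv @ b                                    # ridge estimate
        widths = np.sqrt(np.einsum("ij,jk,ik->i", X, V_inv, X))  # ||x||_{V^-1}
        arm = int(np.argmax(X @ theta_hat + beta * widths))      # optimistic pick
        reward = f[arm] + noise_std * rng.normal()
        V += np.outer(X[arm], X[arm])
        b += reward * X[arm]
        regret[t] = f_star - f[arm]

    return np.cumsum(regret)


if __name__ == "__main__":
    X, w, f = make_gap_adjusted_problem()
    cumulative = lin_ucb(X, f)
    print("cumulative regret after", len(cumulative), "rounds:", cumulative[-1])
```

Plotting the returned cumulative regret against the round index gives a rough empirical view of how the algorithm behaves under this kind of misspecification; a single toy run like this says nothing about the paper's theoretical rates.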

Pages: 1294-1303

Page count: 10
