On the Minimax Regret for Online Learning with Feedback Graphs

Cited by: 0

Authors:

Eldowa, Khaled [1]

Esposito, Emmanuel [1, 2]

Cesari, Tommaso [3]

Cesa-Bianchi, Nicolo [1]

Affiliations:

[1] Univ Milan, Milan, Italy

[2] Ist Italiano Tecnol, Genoa, Italy

[3] Univ Ottawa, Ottawa, ON, Canada

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC)

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In this work, we improve on the upper and lower bounds for the regret of online learning with strongly observable undirected feedback graphs. The best known upper bound for this problem is $O(\sqrt{\alpha T \ln K})$, where $K$ is the number of actions, $\alpha$ is the independence number of the graph, and $T$ is the time horizon. The $\sqrt{\ln K}$ factor is known to be necessary when $\alpha = 1$ (the experts case). On the other hand, when $\alpha = K$ (the bandits case), the minimax rate is known to be $\Theta(\sqrt{KT})$, and a lower bound $\Omega(\sqrt{\alpha T})$ is known to hold for any $\alpha$. Our improved upper bound $O(\sqrt{\alpha T (1 + \ln(K/\alpha))})$ holds for any $\alpha$ and matches the lower bounds for bandits and experts, while interpolating intermediate cases. To prove this result, we use FTRL with $q$-Tsallis entropy for a carefully chosen value of $q \in [1/2, 1)$ that varies with $\alpha$. The analysis of this algorithm requires a new bound on the variance term in the regret. We also show how to extend our techniques to time-varying graphs, without requiring prior knowledge of their independence numbers. Our upper bound is complemented by an improved $\Omega(\sqrt{\alpha T (\ln K)/(\ln \alpha)})$ lower bound for all $\alpha > 1$, whose analysis relies on a novel reduction to multitask learning. This shows that a logarithmic factor is necessary as soon as $\alpha < K$.
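The abstract names FTRL with $q$-Tsallis entropy as the algorithm behind the upper bound. The following is a minimal sketch of what one such update could look like, not the authors' implementation. It assumes the regularizer $(1 - \sum_i p_i^q)/(1 - q)$, a symmetric 0/1 adjacency matrix with self-loops, and the standard importance-weighted loss estimator for feedback graphs. The function names `tsallis_ftrl_probs` and `loss_estimates`, the learning rate `eta`, and the bisection solver are illustrative choices. The paper's specific tuning of $q$ and of the learning rate is not reproduced here, so `q` is left as a caller-supplied value in $[1/2, 1)$.

```python
import numpy as np


def tsallis_ftrl_probs(cum_loss, eta, q, iters=100):
    """Minimise eta*<cum_loss, p> + (1 - sum_i p_i**q)/(1 - q) over the simplex.

    Stationarity gives p_i = ((eta*L_i + mu) / c) ** (-1/(1-q)) with c = q/(1-q);
    the multiplier mu is found by bisection so that the p_i sum to one.
    """
    cum_loss = np.asarray(cum_loss, dtype=float)
    L = eta * (cum_loss - cum_loss.min())  # shifting losses does not change the argmin
    K = L.size
    c = q / (1.0 - q)
    lo, hi = c, c * K ** (1.0 - q)  # sum(p) >= 1 at lo, sum(p) <= 1 at hi
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        p = ((L + mu) / c) ** (-1.0 / (1.0 - q))
        if p.sum() > 1.0:
            lo = mu
        else:
            hi = mu
    p = ((L + hi) / c) ** (-1.0 / (1.0 - q))
    return p / p.sum()  # remove the residual bisection error


def loss_estimates(p, adj, played, losses):
    """Importance-weighted estimates (assumed estimator, self-loops assumed).

    adj[j, i] == 1 means that playing j reveals the loss of i.
    """
    adj = np.asarray(adj)
    losses = np.asarray(losses, dtype=float)
    observed = adj[played].astype(bool)   # actions revealed this round
    obs_prob = adj.T @ p                  # probability that each action is revealed
    est = np.zeros_like(losses)
    est[observed] = losses[observed] / obs_prob[observed]
    return est


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K, T, eta, q = 4, 50, 0.1, 0.5
    # Hypothetical graph: a path 0-1-2-3 with self-loops.
    adj = np.eye(K, dtype=int)
    for i in range(K - 1):
        adj[i, i + 1] = adj[i + 1, i] = 1
    cum_est = np.zeros(K)
    for _ in range(T):
        p = tsallis_ftrl_probs(cum_est, eta, q)
        played = rng.choice(K, p=p)
        losses = rng.uniform(size=K)      # synthetic losses, for illustration only
        cum_est += loss_estimates(p, adj, played, losses)
    print(p)
```

Under these assumptions, a complete graph makes every estimate equal to the true loss (the experts case), while a graph with only self-loops reduces the estimator to the usual bandit importance weighting.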


Pages: 12
