On the Minimax Regret for Online Learning with Feedback Graphs

Cited by: 0
Authors
Eldowa, Khaled [1]
Esposito, Emmanuel [1, 2]
Cesari, Tommaso [3]
Cesa-Bianchi, Nicolo [1]
Affiliations
[1] Univ Milan, Milan, Italy
[2] Ist Italiano Tecnol, Genoa, Italy
[3] Univ Ottawa, Ottawa, ON, Canada
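Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract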
In this work, we improve on the upper and lower bounds for the regret of online learning with strongly observable undirected feedback graphs. The best known upper bound for this problem is $O(\sqrt{\alpha T \ln K})$, where $K$ is the number of actions, $\alpha$ is the independence number of the graph, and $T$ is the time horizon. The $\sqrt{\ln K}$ factor is known to be necessary when $\alpha = 1$ (the experts case). On the other hand, when $\alpha = K$ (the bandits case), the minimax rate is known to be $\Theta(\sqrt{KT})$, and a lower bound of $\Omega(\sqrt{\alpha T})$ is known to hold for any $\alpha$. Our improved upper bound $O(\sqrt{\alpha T (1 + \ln(K/\alpha))})$ holds for any $\alpha$ and matches the lower bounds for bandits and experts, while interpolating the intermediate cases. To prove this result, we use FTRL with $q$-Tsallis entropy for a carefully chosen value of $q \in [1/2, 1)$ that varies with $\alpha$. The analysis of this algorithm requires a new bound on the variance term in the regret. We also show how to extend our techniques to time-varying graphs, without requiring prior knowledge of their independence numbers. Our upper bound is complemented by an improved $\Omega(\sqrt{\alpha T (\ln K)/(\ln \alpha)})$ lower bound for all $\alpha > 1$, whose analysis relies on a novel reduction to multitask learning. This shows that a logarithmic factor is necessary as soon as $\alpha < K$.
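To make the algorithmic idea in the abstract concrete, the following is a minimal sketch of FTRL with a $q$-Tsallis entropy regularizer on an undirected feedback graph. It is an illustration under stated assumptions, not the authors' implementation: the function names, the learning rate `eta`, and the exponent `q` are hypothetical and supplied by the caller (the paper's specific choice of $q$ as a function of $\alpha$ is not reproduced here), every vertex is assumed to have a self-loop, and the loss estimates use the standard importance-weighted form, which may differ from the estimator analysed in the paper.

```python
import numpy as np


def tsallis_ftrl_distribution(cum_loss, eta, q, iters=100):
    """Minimise <cum_loss, p> + (1/eta) * (1 - sum(p**q)) / (1 - q) over the simplex.

    Assumes 0 < q < 1 and eta > 0. The normalising multiplier is found by bisection.
    """
    cum_loss = np.asarray(cum_loss, dtype=float)
    K = cum_loss.size
    c = q / (eta * (1.0 - q))
    shifted = cum_loss - cum_loss.min()  # shifting all losses does not change the minimiser
    # Stationarity gives p_i = ((shifted_i + lam) / c) ** (-1 / (1 - q)).
    # At lam = c the best action alone has mass 1 (sum >= 1);
    # at lam = c * K**(1 - q) every mass is at most 1/K (sum <= 1).
    lo, hi = c, c * K ** (1.0 - q)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        p = ((shifted + mid) / c) ** (-1.0 / (1.0 - q))
        if p.sum() > 1.0:
            lo = mid
        else:
            hi = mid
    p = ((shifted + hi) / c) ** (-1.0 / (1.0 - q))
    return p / p.sum()  # remove the residual bisection error


def run_ftrl_feedback_graph(adj, losses, eta, q, seed=0):
    """Play T rounds and return the learner's cumulative loss.

    adj    : (K, K) symmetric boolean adjacency matrix with a True diagonal (assumption).
    losses : (T, K) array of losses in [0, 1].
    """
    rng = np.random.default_rng(seed)
    adj = np.asarray(adj, dtype=bool)
    T, K = losses.shape
    cum_est = np.zeros(K)
    total = 0.0
    for t in range(T):
        p = tsallis_ftrl_distribution(cum_est, eta, q)
        action = rng.choice(K, p=p)
        total += losses[t, action]
        observed = adj[action]            # actions whose losses are revealed this round
        obs_prob = adj.astype(float) @ p  # probability that each action is observed
        # Importance-weighted loss estimates for the observed actions only.
        cum_est[observed] += losses[t, observed] / obs_prob[observed]
    return total


if __name__ == "__main__":
    # Two disjoint cliques {0, 1} and {2, 3}: K = 4 actions, independence number 2.
    adj = np.array([[1, 1, 0, 0],
                    [1, 1, 0, 0],
                    [0, 0, 1, 1],
                    [0, 0, 1, 1]], dtype=bool)
    losses = np.full((500, 4), 0.9)
    losses[:, 0] = 0.1  # action 0 is the best action
    print(run_ftrl_feedback_graph(adj, losses, eta=0.1, q=0.75))
```

In this sketch, the two extremes mentioned in the abstract correspond to the adjacency matrix being all ones (experts, $\alpha = 1$) or the identity (bandits, $\alpha = K$).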
Pages: 12