Constant or Logarithmic Regret in Asynchronous Multiplayer Bandits with Limited Communication

Cited by: 0
Authors
Richard, Hugo [1]
Boursier, Etienne [2]
Perchet, Vianney
Affiliations
[1] Criteo AI Lab, FAIRPLAY Joint Team, Paris, France
[2] Univ Paris Saclay LMO, INRIA, Orsay, France
Keywords
MULTIARMED BANDIT; REWARDS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multiplayer bandits have recently garnered significant attention due to their relevance in cognitive radio networks. While the existing body of literature predominantly focuses on synchronous players, real-world radio networks, such as those in IoT applications, often feature asynchronous (i.e., randomly activated) devices. This highlights the need for addressing the more challenging asynchronous multiplayer bandits problem. Our first result shows that a natural extension of UCB achieves a minimax regret of O(√(T log(T))) in the centralized setting. More significantly, we introduce Cautious Greedy, which uses O(log(T)) communications and whose instance-dependent regret is constant if the optimal policy assigns at least one player to each arm (a situation proven to occur when arm means are sufficiently close). Otherwise, the regret is, as usual, log(T) times the sum of some inverse sub-optimality gaps. We substantiate the optimality of Cautious Greedy through lower-bound analysis based on data-dependent terms. Therefore, we establish a strong baseline for asynchronous multiplayer bandits, at least with O(log(T)) communications.
Pages: 43
Related papers
22 in total
  • [21] A Decentralized Policy with Logarithmic Regret for a Class of Multi-Agent Multi-Armed Bandit Problems with Option Unavailability Constraints and Stochastic Communication Protocols
    Pankayaraj, Pathmanathan
    Maithripala, D. H. S.
    Berg, J. M.
    2020 59TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2020, : 5974 - 5979
  • [22] Constant Time-Headway Spacing Policy with Limited Communication Range for Discrete Time Platoon Systems
    Peters, Andrés A.
    Rojas, Alejandro J.
    IFAC PAPERSONLINE, 2020, 53 (02): : 15198 - 15203