Information Sharing in Distributed Stochastic Bandits

Authors
Buccapatnam, Swapna [1, 2]
Tan, Jian [2 ]
Zhang, Li [2 ]
Affiliations
[1] Princeton Univ, Dept Elect Engn, Princeton, NJ 08544 USA
[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
MULTIARMED BANDIT;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Information sharing is an important issue for stochastic bandit problems in a distributed setting. Consider N players dealing with the same multi-armed bandit problem. All players receive requests simultaneously and must choose one of M actions for each request. Sharing information among these N players can decrease the regret for each of them, but it also incurs cooperation and communication overhead. In this setting, we study how cooperation and communication affect system performance, measured by regret and communication cost. For both scenarios, we establish a uniform lower bound on the regret of the entire system as a function of time and network size. Concerning cooperation, we study the problem from a game-theoretic perspective. When each player's actions and payoffs are immediately visible to all others, we identify strategies for all players under which cooperative exploration is ensured. Regarding the communication cost, we consider incomplete information sharing, in which a player's payoffs and actions are not entirely available to others. The players communicate observations to each other to reduce their regret, but at a cost. We show that a logarithmic communication cost is necessary to achieve the optimal regret. For Bernoulli arrivals, we specify a policy that achieves the optimal regret with a logarithmic communication cost. Our work opens a novel direction towards understanding information sharing for active learning in a distributed environment.
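The setting in the abstract (N players, M actions, with or without shared observations) can be illustrated with a small simulation. The sketch below is not the policy from the paper: it assumes Bernoulli payoffs and swaps in the standard UCB1 index rule, and the function name `run_ucb` and all of its parameters are hypothetical names chosen for illustration. It only contrasts fully pooled statistics with fully private ones and does not model communication cost.

```python
import math
import random


def run_ucb(means, n_players, horizon, share, seed=0):
    """Simulate n_players running UCB1 on the same Bernoulli bandit.

    share=True: all actions and payoffs are visible to everyone, so the
    players keep one pooled set of statistics (assumption for this sketch).
    share=False: each player keeps private statistics and learns alone.
    Returns (pseudo-regret summed over all players, total number of pulls).
    """
    rng = random.Random(seed)
    n_arms = len(means)
    best = max(means)
    n_stats = 1 if share else n_players
    counts = [[0] * n_arms for _ in range(n_stats)]
    sums = [[0.0] * n_arms for _ in range(n_stats)]
    regret = 0.0

    for _ in range(horizon):
        updates = []
        # Requests arrive simultaneously: every player decides using the
        # statistics available at the start of the round.
        for p in range(n_players):
            s = 0 if share else p
            c, r = counts[s], sums[s]
            total = sum(c)

            def index(a):
                if c[a] == 0:
                    return float("inf")  # try every arm at least once
                return r[a] / c[a] + math.sqrt(2.0 * math.log(total) / c[a])

            arm = max(range(n_arms), key=index)
            payoff = 1.0 if rng.random() < means[arm] else 0.0
            updates.append((s, arm, payoff))
            regret += best - means[arm]  # expected (pseudo-)regret of this pull

        # Observations become visible only after the round ends.
        for s, arm, payoff in updates:
            counts[s][arm] += 1
            sums[s][arm] += payoff

    return regret, sum(sum(c) for c in counts)
```

Calling `run_ucb(means, n_players, horizon, share=True)` and again with `share=False` on the same `means` gives two system-wide regret values that can be compared directly. The arm means, horizon, and seed are free choices of the experimenter, not values taken from the paper.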
Pages: 9