Information Sharing in Distributed Stochastic Bandits

Cited by: 0
Authors
Buccapatnam, Swapna [1, 2]
Tan, Jian [2]
Zhang, Li [2]
Affiliations
[1] Princeton Univ, Dept Elect Engn, Princeton, NJ 08544 USA
[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
MULTIARMED BANDIT
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
Information sharing is an important issue for stochastic bandit problems in a distributed setting. Consider N players dealing with the same multi-armed bandit problem. All players receive requests simultaneously and must choose one of M actions for each request. Sharing information among these N players can decrease the regret for each of them but also incurs cooperation and communication overhead. In this setting, we study how cooperation and communication affect system performance, measured by regret and communication cost. For both scenarios, we establish a uniform lower bound on the regret of the entire system as a function of time and network size. Concerning cooperation, we study the problem from a game-theoretic perspective. When each player's actions and payoffs are immediately visible to all others, we identify strategies for all players under which cooperative exploration is ensured. Regarding the communication cost, we consider incomplete information sharing, in which a player's payoffs and actions are not entirely available to others. The players can communicate observations to one another to reduce their regret, but at a cost. We show that a logarithmic communication cost is necessary to achieve the optimal regret. For Bernoulli arrivals, we specify a policy that achieves the optimal regret with a logarithmic communication cost. Our work opens a novel direction towards understanding information sharing for active learning in a distributed environment.
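The effect of full information sharing described in the abstract can be illustrated with a small simulation. The sketch below is not the policy from the paper: it assumes Bernoulli payoffs and the standard UCB1 index, and the names `choose_arm` and `simulate` are hypothetical. It compares N players who pool every pull and payoff in one table against N players who each learn only from their own pulls, and returns the expected regret summed over all players.

```python
import math
import random


def choose_arm(counts, sums):
    """Pick an arm with the UCB1 rule (an assumed stand-in policy)."""
    # Pull every arm once before using the index.
    for arm, pulls in enumerate(counts):
        if pulls == 0:
            return arm
    total = sum(counts)

    def index(arm):
        mean = sums[arm] / counts[arm]
        bonus = math.sqrt(2.0 * math.log(total) / counts[arm])
        return mean + bonus

    return max(range(len(counts)), key=index)


def simulate(num_players, arm_means, horizon, share, seed=0):
    """Return the expected regret summed over all players.

    share=True: all pulls and payoffs go into one pooled table.
    share=False: each player keeps a private table.
    """
    rng = random.Random(seed)
    num_arms = len(arm_means)
    best = max(arm_means)
    num_tables = 1 if share else num_players
    counts = [[0] * num_arms for _ in range(num_tables)]
    sums = [[0.0] * num_arms for _ in range(num_tables)]
    regret = 0.0
    for _ in range(horizon):
        # All players decide simultaneously from the statistics
        # available at the start of the round.
        choices = []
        for player in range(num_players):
            table = 0 if share else player
            choices.append(choose_arm(counts[table], sums[table]))
        # Payoffs are drawn and recorded after all choices are made.
        for player, arm in enumerate(choices):
            table = 0 if share else player
            reward = 1.0 if rng.random() < arm_means[arm] else 0.0
            counts[table][arm] += 1
            sums[table][arm] += reward
            regret += best - arm_means[arm]
    return regret


if __name__ == "__main__":
    means = [0.9, 0.5, 0.5]  # illustrative arm means, not from the paper
    print("shared  :", simulate(10, means, 2000, share=True, seed=1))
    print("isolated:", simulate(10, means, 2000, share=False, seed=1))
```

With a pooled table the exploration cost is paid roughly once for the whole system rather than once per player, which is the intuition behind studying regret as a function of both time and network size. The sketch ignores communication cost entirely; it only contrasts the two extremes of full sharing and no sharing.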
Pages: 9