Thompson Sampling Based Multi-Armed-Bandit Mechanism Using Neural Networks

Cited by: 0
Authors
Manisha, Padala [1 ]
Gujar, Sujit [1 ]
Affiliations
[1] Int Inst Informat Technol, Hyderabad, Telangana, India
Keywords
Mechanism Design; MAB; Neural Networks;
DOI
Not available
CLC Number
TP301 [Theory and Methods];
Subject Classification Code
081202 ;
Abstract
In many practical applications, such as crowd-sourcing and online advertisement, the use of mechanism design (auction-based mechanisms) depends on inherent stochastic parameters which are unknown. These parameters are learned using multi-armed bandit (MAB) algorithms, and mechanisms that incorporate MAB learning are referred to as multi-armed-bandit (MAB) mechanisms. While most MAB mechanisms rely on frequentist approaches such as upper-confidence-bound algorithms, recent work has shown that Bayesian approaches such as Thompson sampling yield mechanisms with better regret bounds; however, the lower regret comes at the cost of a weaker game-theoretic property, namely Within-Period Dominant Strategy Incentive Compatibility (WP-DSIC). The existing payment rules used in Thompson-sampling-based mechanisms may cause negative utility to the auctioneer. Moreover, if we wish to minimize the cost to the auctioneer, it is very challenging to design payment rules that satisfy WP-DSIC while learning through Thompson sampling. In this work, we propose a data-driven approach to designing MAB mechanisms: we use neural networks to design a payment rule that is WP-DSIC, while the allocation rule is modeled using Thompson sampling. Our results, in a crowd-sourcing setting for recruiting high-quality workers, indicate that the learned payment rule achieves lower cost while maximizing social welfare and also reducing the variance in the agents' utilities.
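The allocation side of the mechanism described in the abstract is driven by Thompson sampling. As a minimal illustration of that component only (a hypothetical Beta-Bernoulli sketch, not the paper's actual mechanism, which couples allocation with a learned WP-DSIC payment rule), a Thompson sampling loop maintains a Beta posterior per arm, samples a mean estimate from each posterior, and pulls the arm with the highest sample:

```python
import random

def thompson_sampling(arm_probs, horizon, seed=0):
    """Beta-Bernoulli Thompson sampling; returns per-arm pull counts.

    arm_probs are the (unknown to the learner) true success rates,
    used here only to simulate rewards.
    """
    rng = random.Random(seed)
    n = len(arm_probs)
    successes = [1] * n  # Beta(1, 1) uniform prior for each arm
    failures = [1] * n
    pulls = [0] * n
    for _ in range(horizon):
        # Sample a plausible mean for each arm from its posterior,
        # then play the arm whose sample is largest.
        samples = [rng.betavariate(successes[i], failures[i]) for i in range(n)]
        arm = max(range(n), key=lambda i: samples[i])
        # Observe a Bernoulli reward and update that arm's posterior.
        reward = 1 if rng.random() < arm_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

pulls = thompson_sampling([0.3, 0.7], horizon=1000)
```

Over time the posterior of the better arm concentrates, so it is sampled highest (and hence pulled) increasingly often; in the paper's crowd-sourcing setting, the "arms" correspond to workers of unknown quality.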
Pages: 2111 - 2113 (3 pages)
Related Papers (50 total)
  • [31] Using Multi-Armed Bandit Learning for Thwarting MAC Layer Attacks in Wireless Networks
    Dutta, Hrishikesh
    Bhuyan, Amit Kumar
    Biswas, Subir
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2024,
  • [32] Multi-Armed Bandit for Edge Computing in Dynamic Networks with Uncertainty
    Ghoorchian, Saeed
    Maghsudi, Setareh
    PROCEEDINGS OF THE 21ST IEEE INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING ADVANCES IN WIRELESS COMMUNICATIONS (IEEE SPAWC2020), 2020,
  • [33] Learning the Truth by Weakly Connected Agents in Social Networks Using Multi-Armed Bandit
    Odeyomi, Olusola Tolulope
    IEEE ACCESS, 2020, 8 : 202090 - 202099
  • [35] A multi-armed bandit approach for exploring partially observed networks
    Madhawa, Kaushalya
    Murata, Tsuyoshi
    APPLIED NETWORK SCIENCE, 2019, 4 (01)
  • [37] Pruning Neural Networks Using Multi-Armed Bandits
    Ameen, Salem
    Vadera, Sunil
    COMPUTER JOURNAL, 2020, 63 (07): : 1099 - 1108
  • [38] Approximate Thompson Sampling via Epistemic Neural Networks
    Osband, Ian
    Wen, Zheng
    Asghari, Seyed Mohammad
    Dwaracherla, Vikranth
    Ibrahimi, Morteza
    Lu, Xiuyuan
    Van Roy, Benjamin
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216 : 1586 - 1595
  • [39] BandiTS: Dynamic Timing Speculation Using Multi-Armed Bandit Based Optimization
    Zhang, Jeff
    Garg, Siddharth
    PROCEEDINGS OF THE 2017 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE), 2017, : 922 - 925
  • [40] Multi-User Communication Networks: A Coordinated Multi-Armed Bandit Approach
    Avner, Orly
    Mannor, Shie
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2019, 27 (06) : 2192 - 2207