Thompson Sampling Based Multi-Armed-Bandit Mechanism Using Neural Networks

Cited by: 0
Authors
Manisha, Padala [1]
Gujar, Sujit [1]
Affiliations
[1] Int Inst Informat Technol, Hyderabad, Telangana, India
Keywords
Mechanism Design; MAB; Neural Networks
DOI
Not available
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
In many practical applications, such as crowd-sourcing and online advertisement, the use of mechanism design (auction-based mechanisms) depends on inherent stochastic parameters that are unknown. These parameters are learnt using multi-armed bandit (MAB) algorithms. Mechanisms that incorporate MAB are referred to as Multi-Armed-Bandit Mechanisms. While most MAB mechanisms focus on frequentist approaches such as upper confidence bound algorithms, recent work has shown that Bayesian approaches such as Thompson sampling result in mechanisms with better regret bounds. The lower regret, however, comes at the cost of a weaker game-theoretic property, namely Within-Period Dominant Strategy Incentive Compatibility (WP-DSIC). The existing payment rules used in Thompson sampling based mechanisms may cause negative utility to the auctioneer. In addition, if we wish to minimize the cost to the auctioneer, it is very challenging to design payment rules that satisfy WP-DSIC while learning through Thompson sampling. In our work, we propose a data-driven approach for designing MAB mechanisms. Specifically, we use neural networks to design a payment rule that is WP-DSIC, while the allocation rule is modeled using Thompson sampling. Our results, in the setting of crowd-sourcing for recruiting quality workers, indicate that the learned payment rule guarantees better cost while maximizing the social welfare and also ensuring reduced variance in the utilities to the agents.
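As a rough illustration of the Thompson sampling allocation step mentioned in the abstract, the sketch below picks one worker per round from Beta posteriors over unknown worker qualities. It is not the authors' mechanism: the Beta-Bernoulli model, the function names `thompson_select` and `run_allocation`, and the example quality values are all assumptions made for illustration, and the learned neural-network payment rule is deliberately left out.

```python
import random


def thompson_select(successes, failures, rng):
    """Draw one quality sample per worker from Beta(s + 1, f + 1) and
    return the index of the worker with the largest sample."""
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def run_allocation(true_qualities, rounds, seed=0):
    """Simulate the allocation only (no payments): each round one worker is
    selected and succeeds with its hidden quality (hypothetical values)."""
    rng = random.Random(seed)
    k = len(true_qualities)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(rounds):
        i = thompson_select(successes, failures, rng)
        pulls[i] += 1
        # Bernoulli feedback: the task is completed well with probability q_i.
        if rng.random() < true_qualities[i]:
            successes[i] += 1
        else:
            failures[i] += 1
    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = run_allocation([0.2, 0.5, 0.9], rounds=2000)
    print("selections per worker:", pulls)
```

In this sketch the posterior concentrates on the higher-quality worker as feedback accumulates, so that worker ends up selected in most rounds. A full mechanism would additionally need a payment rule on top of this allocation, which is the part the abstract proposes to learn.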
Pages: 2111-2113
Number of pages: 3
Related papers
50 in total
  • [21] Automatic Quality of Experience Management for WLAN Networks using Multi-Armed Bandit
    Moura, Henrique D.
    Macedo, Daniel Fernandes
    Vieira, Marcos A. M.
    2019 IFIP/IEEE SYMPOSIUM ON INTEGRATED NETWORK AND SERVICE MANAGEMENT (IM), 2019, : 279 - 288
  • [22] Multi-Armed-Bandit-Based Spectrum Scheduling Algorithms in Wireless Networks: A Survey
    Li, Feng
    Yu, Dongxiao
    Yang, Huan
    Yu, Jiguo
    Karl, Holger
    Cheng, Xiuzhen
    IEEE WIRELESS COMMUNICATIONS, 2020, 27 (01) : 24 - 30
  • [23] Multi-armed bandit based distributed resilient consensus and its in social networks
    Hou, Jian
    Chen, Zhiyong
    Zhang, Mingyue
    Wang, Xiaomin
    JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2022, 359 (10) : 4997 - 5013
  • [24] Multi-Agent Thompson Sampling for Bandit Applications with Sparse Neighbourhood Structures
    Timothy Verstraeten
    Eugenio Bargiacchi
    Pieter J. K. Libin
    Jan Helsen
    Diederik M. Roijers
    Ann Nowé
    Scientific Reports, 10
  • [25] Multi-Agent Thompson Sampling for Bandit Applications with Sparse Neighbourhood Structures
    Verstraeten, Timothy
    Bargiacchi, Eugenio
    Libin, Pieter J. K.
    Helsen, Jan
    Roijers, Diederik M.
    Nowe, Ann
    SCIENTIFIC REPORTS, 2020, 10 (01)
  • [26] AdaptiveBandit: A Multi-armed Bandit Framework for Adaptive Sampling in Molecular Simulations
    Perez, Adria
    Herrera-Nieto, Pablo
    Doerr, Stefan
    De Fabritiis, Gianni
    JOURNAL OF CHEMICAL THEORY AND COMPUTATION, 2020, 16 (07) : 4685 - 4693
  • [27] Neural Architecture Search via Combinatorial Multi-Armed Bandit
    Huang, Hanxun
    Ma, Xingjun
    Erfani, Sarah M.
    Bailey, James
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [28] Asymptotic Performance of Thompson Sampling for Batched Multi-Armed Bandits
    Kalkanli, Cem
    Ozgur, Ayfer
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2023, 69 (09) : 5956 - 5970
  • [29] Asymptotic Performance of Thompson Sampling in the Batched Multi-Armed Bandits
    Kalkanli, Cem
    Ozgur, Ayfer
    2021 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2021, : 539 - 544
  • [30] Thompson sampling for multi-armed bandits in big data environments
    Kim, Min Kyong
    Hwang, Beom Seuk
    KOREAN JOURNAL OF APPLIED STATISTICS, 2024, 37 (05)