Thompson Sampling Based Multi-Armed-Bandit Mechanism Using Neural Networks

Cited by: 0
Authors
Manisha, Padala [1]
Gujar, Sujit [1]
Affiliations
[1] Int Inst Informat Technol, Hyderabad, Telangana, India
Keywords
Mechanism Design; MAB; Neural Networks;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
In many practical applications such as crowd-sourcing and online advertisement, the use of mechanism design (auction-based mechanisms) depends upon inherent stochastic parameters which are unknown. These parameters are learnt using multi-armed bandit (MAB) algorithms. Mechanisms that incorporate MAB are referred to as Multi-Armed-Bandit Mechanisms. While most MAB mechanisms focus on frequentist approaches like upper confidence bound algorithms, recent work has shown that Bayesian approaches like Thompson sampling result in mechanisms with better regret bounds. This lower regret, however, comes at the cost of a weaker game-theoretic property, namely Within-Period Dominant Strategy Incentive Compatibility (WP-DSIC). The existing payment rules used in Thompson sampling based mechanisms may cause negative utility to the auctioneer. In addition, if we wish to minimize the cost to the auctioneer, it is very challenging to design payment rules that satisfy WP-DSIC while learning through Thompson sampling. In our work, we propose a data-driven approach for designing MAB mechanisms. Specifically, we use neural networks to design a payment rule that is WP-DSIC, while the allocation rule is modeled using Thompson sampling. Our results, in the setting of crowd-sourcing for recruiting quality workers, indicate that the learned payment rule guarantees a lower cost while maximizing the social welfare and also ensuring reduced variance in the utilities to the agents.
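As a rough illustration of the Thompson sampling allocation mentioned in the abstract, the sketch below runs a textbook Beta-Bernoulli Thompson sampler over workers with unknown quality. It is not the authors' mechanism: it omits bids, the payment rule, and the neural network entirely, and the names `thompson_select`, `run_thompson`, and `true_qualities`, as well as the uniform Beta(1, 1) prior, are assumptions made for this example only.

```python
import random


def thompson_select(successes, failures, rng):
    """Sample a quality for each arm from its Beta posterior; return the best arm's index."""
    # Assumed prior: Beta(1, 1), i.e. uniform over [0, 1].
    samples = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def run_thompson(true_qualities, rounds, seed=0):
    """Simulate `rounds` allocations against hypothetical Bernoulli worker qualities."""
    rng = random.Random(seed)
    k = len(true_qualities)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(rounds):
        arm = thompson_select(successes, failures, rng)
        # The chosen worker completes the task well with its (unknown) true quality.
        reward = 1 if rng.random() < true_qualities[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = run_thompson([0.1, 0.9], rounds=2000)
    print("pulls per worker:", pulls)
```

In a run like this, the sampler should allocate most rounds to the higher-quality worker as its posterior sharpens; a mechanism of the kind the abstract describes would additionally need a payment rule on top of this allocation.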
Pages: 2111 - 2113
Number of pages: 3
Related papers
50 in total
  • [41] Performance of TVWS-based LoRa Transmissions using Multi-Armed Bandit
    Askhedkar, Anjali R.
    Chaudhari, Bharat S.
    Saeed, Rashid A.
    Alhumyani, Hesham
    Alenizi, Abdullah
    INTERNATIONAL JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING SYSTEMS, 2024, 15 (09) : 759 - 769
  • [42] Prioritized Experience Replay based on Multi-armed Bandit
    Liu, Ximing
    Zhu, Tianqing
    Jiang, Cuiqing
    Ye, Dayong
    Zhao, Fuqing
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 189
  • [43] Motion Planning as Online Learning: A Multi-Armed Bandit Approach to Kinodynamic Sampling-Based Planning
    Faroni, Marco
    Berenson, Dmitry
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, 8 (10) : 6651 - 6658
  • [44] Analysis of Thompson Sampling for Partially Observable Contextual Multi-Armed Bandits
    Park, Hongju
    Faradonbeh, Mohamad Kazem Shirani
    IEEE CONTROL SYSTEMS LETTERS, 2022, 6 : 2150 - 2155
  • [45] A quality assuring, cost optimal multi-armed bandit mechanism for expertsourcing
    Jain, Shweta
    Gujar, Sujit
    Bhat, Satyanath
    Zoeter, Onno
    Narahari, Y.
    ARTIFICIAL INTELLIGENCE, 2018, 254 : 44 - 63
  • [46] Gateway Selection in Millimeter Wave UAV Wireless Networks Using Multi-Player Multi-Armed Bandit
    Mohamed, Ehab Mahmoud
    Hashima, Sherief
    Aldosary, Abdallah
    Hatano, Kohei
    Abdelghany, Mahmoud Ahmed
    SENSORS, 2020, 20 (14) : 1 - 22
  • [47] Multi-Armed Bandit-Based Secure Routing in Air-Ground Integrated Networks
    Liu, Xiaoyuan
    Xu, Yang
    Liu, Jia
    Takakura, Hiroki
    Liu, Xiaoying
    Zheng, Kechen
    Shiratori, Norio
    2024 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE, WCNC 2024, 2024,
  • [48] Active Learning on Heterogeneous Information Networks: A Multi-armed Bandit Approach
    Xin, Doris
    El-Kishky, Ahmed
    Liao, De
    Norick, Brandon
    Han, Jiawei
    2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018, : 1350 - 1355
  • [49] A Multi-armed Bandit Approach to Distributed Robust Beamforming in Multicell Networks
    Zhang, Xinruo
    Nakhai, Mohammad Reza
    Ariffin, Wan Nur Suryani Firuz Wan
    2016 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2016,
  • [50] Autonomous Resource Allocation for dense LTE networks: A Multi Armed Bandit formulation
    Feki, Afef
    Capdevielle, Veronique
    2011 IEEE 22ND INTERNATIONAL SYMPOSIUM ON PERSONAL INDOOR AND MOBILE RADIO COMMUNICATIONS (PIMRC), 2011, : 66 - 70