Thompson Sampling Based Multi-Armed-Bandit Mechanism Using Neural Networks


Authors:

Manisha, Padala [1]

Gujar, Sujit [1]

Affiliations:

[1] Int Inst Informat Technol, Hyderabad, Telangana, India

Source:

AAMAS '19: Proceedings of the 18th International Conference on Autonomous Agents and Multiagent Systems | 2019

Keywords:

Mechanism Design; MAB; Neural Networks;

DOI:

Not available

Chinese Library Classification number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

In many practical applications such as crowd-sourcing and online advertisement, the use of mechanism design (auction-based mechanisms) depends upon inherent stochastic parameters which are unknown. These parameters are learnt using multi-armed bandit (MAB) algorithms. Mechanisms which incorporate MAB are referred to as Multi-Armed-Bandit Mechanisms. While most MAB mechanisms focus on frequentist approaches like upper confidence bound algorithms, recent work has shown that using Bayesian approaches like Thompson sampling results in mechanisms with better regret bounds, although the lower regret is obtained at the cost of the mechanism ending up with a weaker game-theoretic property, i.e., Within-Period Dominant Strategy Incentive Compatibility (WP-DSIC). The existing payment rules used in Thompson sampling based mechanisms may cause negative utility to the auctioneer. In addition, if we wish to minimize the cost to the auctioneer, it is very challenging to design payment rules that satisfy WP-DSIC while learning through Thompson sampling. In our work, we propose a data-driven approach for designing MAB mechanisms. Specifically, we use neural networks to design a payment rule that is WP-DSIC, while the allocation rule is modeled using Thompson sampling. Our results, in the setting of crowd-sourcing for recruiting quality workers, indicate that the learned payment rule guarantees better cost while maximizing the social welfare and also ensuring reduced variance in the utilities to the agents.
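
For readers unfamiliar with Thompson sampling, the allocation side mentioned in the abstract can be illustrated with a minimal Beta-Bernoulli sketch. This is not the authors' implementation: the function names, the uniform Beta(1, 1) prior, the Bernoulli quality model, and the simulated worker qualities are all assumptions made for illustration, and the neural-network payment rule is not shown.

```python
import random


def thompson_select(successes, failures, rng):
    """Return the index of the arm with the highest sampled quality.

    Assumption: arm i has a Beta(successes[i] + 1, failures[i] + 1)
    posterior, i.e. a uniform prior updated with Bernoulli outcomes.
    """
    samples = [
        rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)
    ]
    return max(range(len(samples)), key=samples.__getitem__)


def run_thompson_sampling(true_qualities, rounds, seed=0):
    """Simulate repeated allocation to hypothetical workers (arms).

    true_qualities are made-up success probabilities, unknown to the
    learner and used only to simulate task outcomes.
    """
    rng = random.Random(seed)
    k = len(true_qualities)
    successes = [0] * k
    failures = [0] * k
    for _ in range(rounds):
        arm = thompson_select(successes, failures, rng)
        # Observe a Bernoulli outcome and update that arm's counts.
        if rng.random() < true_qualities[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures
```

Calling `run_thompson_sampling([0.2, 0.5, 0.9], rounds=2000)` returns per-arm success and failure counts; over enough rounds the sampler concentrates its allocations on the arm with the highest simulated quality.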


Pages: 2111-2113

Page count: 3
