Thompson Sampling Based Multi-Armed-Bandit Mechanism Using Neural Networks

Cited by: 0
Authors
Manisha, Padala [1 ]
Gujar, Sujit [1 ]
Affiliation
[1] Int Inst Informat Technol, Hyderabad, Telangana, India
Keywords
Mechanism Design; MAB; Neural Networks;
DOI
Not available
CLC number
TP301 [Theory and methods];
Discipline code
081202 ;
Abstract
In many practical applications, such as crowd-sourcing and online advertising, the use of auction-based mechanisms depends on inherent stochastic parameters that are unknown. These parameters are learned using multi-armed bandit (MAB) algorithms, and mechanisms that incorporate such learning are referred to as multi-armed-bandit (MAB) mechanisms. While most MAB mechanisms rely on frequentist approaches such as upper-confidence-bound algorithms, recent work has shown that Bayesian approaches such as Thompson sampling yield mechanisms with better regret bounds; this lower regret, however, comes at the cost of a weaker game-theoretic property, namely Within-Period Dominant Strategy Incentive Compatibility (WP-DSIC). The existing payment rules used in Thompson sampling based mechanisms may cause negative utility to the auctioneer. Moreover, if we wish to minimize the cost to the auctioneer, it is very challenging to design payment rules that satisfy WP-DSIC while learning through Thompson sampling. In this work, we propose a data-driven approach to designing MAB mechanisms. Specifically, we use neural networks to design a payment rule that is WP-DSIC, while the allocation rule is modeled using Thompson sampling. Our results, in a crowd-sourcing setting where quality workers are recruited, indicate that the learned payment rule achieves lower cost while maximizing social welfare and reducing the variance of the agents' utilities.
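The allocation rule the abstract refers to is standard Thompson sampling. As an illustrative sketch only (not the paper's implementation — the function names, Bernoulli reward model, and uniform Beta(1, 1) priors are all assumptions), each arm, e.g. a worker of unknown quality, keeps a Beta posterior over its success probability; each round, the arm with the largest posterior draw is allocated:

```python
import random

def thompson_sample(successes, failures, rng=random):
    """Return the index of the arm whose Beta(1+s, 1+f) posterior draw is largest."""
    draws = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)

def run(true_means, rounds=3000, seed=0):
    """Simulate Thompson sampling on Bernoulli arms with the given true means."""
    rng = random.Random(seed)
    k = len(true_means)
    successes, failures = [0] * k, [0] * k
    for _ in range(rounds):
        arm = thompson_sample(successes, failures, rng)
        # Bernoulli reward drawn from the chosen arm's true mean
        if rng.random() < true_means[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures
```

Over time the posterior concentrates and the highest-mean arm receives most of the allocations; the mechanism-design contribution of the paper is the neural-network payment rule layered on top of this allocation, which the sketch above does not attempt to model.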
Pages: 2111 - 2113 (3 pages)