A Satisficing Strategy with Variable Reference in the Multi-armed Bandit Problems

Cited by: 0

Authors:

Kohno, Yu [1]

Takahashi, Tatsuji [2]

Affiliations:

[1] Tokyo Denki Univ, Grad Sch Adv Sci & Technol, Hiki, Saitama 3500394, Japan

[2] Tokyo Denki Univ, Hiki, Saitama 3500394, Japan

Source:

PROCEEDINGS OF THE INTERNATIONAL CONFERENCE OF NUMERICAL ANALYSIS AND APPLIED MATHEMATICS 2014 (ICNAAM-2014) | 2015 / Vol. 1648

Keywords:

Symmetric reasoning; decision-making; N armed bandit problem; speed-accuracy trade-off;

DOI:

10.1063/1.4912815

Chinese Library Classification (CLC) number:

O29 [Applied Mathematics]

Subject classification code:

070104

Abstract:

The loosely symmetric model (LS) is a subjective probability model derived from human cognitive characteristics. To demonstrate the value of applying these characteristics, we developed a value function, the "loosely symmetric model with variable reference" (LS-VR), that extends LS to decision-making. An algorithm using LS-VR decides whether to explore by comparing against a reference value, so how the agent acquires that reference value from the environment is important. In this study, we propose acquiring the reference value online using statistical knowledge. As a result, the new method outperformed a strong existing model in the multi-armed bandit problem, a kind of decision-making problem.
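
The idea of deciding whether to explore by comparison with a reference value can be illustrated with a minimal, generic satisficing sketch. It is not the LS-VR value function, whose formula this record does not give. The names `choose_arm` and `run`, the Bernoulli rewards, the "pull the least-tried arm" exploration rule, and the fixed `reference` argument are all assumptions made for illustration; the online acquisition of the reference value described in the abstract is not reproduced here.

```python
import random


def choose_arm(wins, pulls, reference):
    """Pick an arm by comparing empirical means with a reference value.

    Hypothetical helper: a generic satisficing rule, not the paper's LS-VR.
    """
    # Try every arm once before any comparison is possible.
    for arm, n in enumerate(pulls):
        if n == 0:
            return arm
    means = [w / n for w, n in zip(wins, pulls)]
    best = max(range(len(means)), key=lambda a: means[a])
    if means[best] >= reference:
        # Satisfied: the best arm meets the reference, so exploit it.
        return best
    # Not satisfied: explore by pulling the least-tried arm (assumed rule).
    return min(range(len(pulls)), key=lambda a: pulls[a])


def run(probs, reference, steps, seed=0):
    """Simulate Bernoulli arms with success probabilities `probs`."""
    rng = random.Random(seed)
    wins = [0] * len(probs)
    pulls = [0] * len(probs)
    for _ in range(steps):
        arm = choose_arm(wins, pulls, reference)
        pulls[arm] += 1
        if rng.random() < probs[arm]:
            wins[arm] += 1
    return wins, pulls
```

For example, `run([0.3, 0.7], reference=0.5, steps=200)` returns per-arm win and pull counts. In this sketch the agent stops exploring as soon as some arm's empirical mean reaches the reference, which is why the choice of reference value matters: set too low, the agent settles on a mediocre arm; set too high, it never stops exploring.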


Pages: 4
