Dynamic non-Bayesian decision making

Cited by: 5
Authors
Monderer, D
Tennenholtz, M
DOI
10.1613/jair.447
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The model of a non-Bayesian agent who faces a repeated game with incomplete information against Nature is an appropriate tool for modeling general agent-environment interactions. In such a model, the environment state (controlled by Nature) may change arbitrarily, and the feedback/reward function is initially unknown. The agent is not Bayesian; that is, he forms a prior probability neither on Nature's state selection strategy nor on his reward function. A policy for the agent is a function that assigns an action to every history of observations and actions. Two basic feedback structures are considered. In the first, the perfect monitoring case, the agent observes the previous environment state as part of his feedback; in the second, the imperfect monitoring case, all that is available to the agent is the reward obtained. Both settings refer to partially observable processes, in which the current environment state is unknown. Our main result concerns the competitive ratio criterion in the perfect monitoring case. We prove the existence of an efficient stochastic policy that ensures that the competitive ratio is obtained at almost all stages with arbitrarily high probability, where efficiency is measured in terms of rate of convergence. It is further shown that such an optimal policy does not exist in the imperfect monitoring case. Moreover, it is proved that in the perfect monitoring case there is no deterministic policy that satisfies our long-run optimality criterion. In addition, we discuss the maxmin criterion and prove that a deterministic efficient optimal strategy does exist in the imperfect monitoring case under this criterion. Finally, we show that our approach to long-run optimality can be viewed as qualitative, which distinguishes it from previous work in this area.
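The two decision criteria named in the abstract can be illustrated with a small sketch. It is not the paper's policy: it assumes the reward table is already known (the abstract stresses that it is initially unknown), assumes strictly positive rewards, and assumes the usual reading of the competitive ratio of an action as its worst-case ratio to the best achievable reward in each state. The names `REWARD`, `maxmin_action`, `competitive_ratio`, and `competitive_ratio_action` are hypothetical.

```python
from typing import Sequence

Table = Sequence[Sequence[float]]  # rows: actions, columns: environment states


def maxmin_action(reward: Table) -> int:
    """Index of the action whose worst-case reward over all states is largest."""
    return max(range(len(reward)), key=lambda a: min(reward[a]))


def competitive_ratio(reward: Table, action: int) -> float:
    """Worst-case ratio of the action's reward to the best reward in each state.

    Assumes every state has a strictly positive best reward.
    """
    n_states = len(reward[0])
    best = [max(row[s] for row in reward) for s in range(n_states)]
    return min(reward[action][s] / best[s] for s in range(n_states))


def competitive_ratio_action(reward: Table) -> int:
    """Index of the action with the largest competitive ratio."""
    return max(range(len(reward)), key=lambda a: competitive_ratio(reward, a))


# Hypothetical reward table with 2 actions and 2 states (illustration only).
REWARD = [
    [4.0, 4.0],   # action 0: safe, same reward in both states
    [3.0, 12.0],  # action 1: slightly worse in state 0, much better in state 1
]

if __name__ == "__main__":
    print("maxmin action:", maxmin_action(REWARD))
    print("competitive-ratio action:", competitive_ratio_action(REWARD))
```

On this table the two criteria disagree: maxmin selects action 0 (worst-case reward 4 versus 3), while the competitive ratio selects action 1 (ratio 0.75 versus 1/3).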
Pages: 231 - 248
Number of pages: 18
Related papers
50 in total
  • [31] Non-Bayesian updating: a theoretical framework
    Epstein, Larry G.
    Noor, Jawwad
    Sandroni, Alvaro
    THEORETICAL ECONOMICS, 2008, 3 (02) : 193 - 229
  • [32] An axiomatic model of non-Bayesian updating
    Epstein, LG
    REVIEW OF ECONOMIC STUDIES, 2006, 73 (02) : 413 - 436
  • [33] Adaptive Non-Bayesian State Estimation
    Ansari, Ahmad
    Bernstein, Dennis S.
    2016 AMERICAN CONTROL CONFERENCE (ACC), 2016 : 6977 - 6982
  • [34] Evolutionary justifications for non-Bayesian beliefs
    Zhang, Hanzhe
    ECONOMICS LETTERS, 2013, 121 (02) : 198 - 201
  • [35] Spatial estimation: a non-Bayesian alternative
    Barth, Hilary
    Lesser, Ellen
    Taggart, Jessica
    Slusser, Emily
    DEVELOPMENTAL SCIENCE, 2015, 18 (05) : 853 - 862
  • [36] Non-Bayesian testing of a stochastic prediction
    Dekel, Eddie
    Feinberg, Yossi
    REVIEW OF ECONOMIC STUDIES, 2006, 73 (04) : 893 - 906
  • [37] WHY MOST STATISTICIANS ARE NON-BAYESIAN
    FU, JC
    AMERICAN STATISTICIAN, 1986, 40 (04) : 330 - 330
  • [38] Robust Non-Bayesian Social Learning
    Arieli, Itai
    Babichenko, Yakov
    Shlomov, Segev
    ACM EC '19: PROCEEDINGS OF THE 2019 ACM CONFERENCE ON ECONOMICS AND COMPUTATION, 2019 : 549 - 550
  • [39] BAYESIAN AND NON-BAYESIAN TESTS OF INDEPENDENCE IN SEEMINGLY UNRELATED REGRESSIONS
    SHIBA, T
    TSURUMI, H
    INTERNATIONAL ECONOMIC REVIEW, 1988, 29 (02) : 377 - 395
  • [40] Implementation of Dynamic Bayesian Decision Making by Intracellular Kinetics
    Kobayashi, Tetsuya J.
    PHYSICAL REVIEW LETTERS, 2010, 104 (22)