Approximation of Expected Reward Value in MMDP

Cited by: 0
Authors
Hanna, Hosam [1 ]
Yao, Jin [1 ]
Zreik, Khaldoun [2 ]
Affiliations
[1] Univ Caen, GREYC, Dept Comp Sci, F-14032 Caen, France
[2] Univ Paris 08, Paragraphe Lab, F-93526 St Denis 02, France
Source
2008 3RD INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGIES: FROM THEORY TO APPLICATIONS, VOLS 1-5 | 2008
Keywords
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Among researchers in multi-agent systems, there has been growing interest in the coordination problem, particularly when agents' behaviors are stochastic. A multi-agent Markov Decision Process (MMDP) is an efficient way to obtain an optimal set of decisions for all agents to take, but solving one is computationally hard. Existing methods for solving an MMDP rely on each agent having precise knowledge of the behaviors of the others. In this paper, we consider a fully cooperative multi-agent system in which agents must coordinate their uncertain behaviors and each agent can only partially observe the state of the others. We present a method that allows agents to construct and solve an MMDP by exchanging the expected reward values of some states. For large systems, we present a model that approximates the expected reward value using distributed MDPs.
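The abstract's central quantity is the expected reward value of a state. The paper's multi-agent exchange and approximation scheme is not reproduced here; as a minimal illustrative sketch, the following shows the standard single-agent value iteration that computes the expected discounted reward value V(s) on which such methods build. The toy states, transitions, and rewards are invented for illustration.

```python
def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-8):
    """Compute V(s), the expected discounted reward value of each state.

    P[(s, a)] is a list of (next_state, probability) pairs;
    R[(s, a)] is the immediate reward for taking action a in state s.
    """
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Bellman optimality backup over the available actions.
            q = [R[(s, a)] + gamma * sum(p * V[s2] for s2, p in P[(s, a)])
                 for a in actions if (s, a) in P]
            new_v = max(q) if q else 0.0
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < tol:
            return V

# Toy 2-state MDP: from s0, action "go" reaches the rewarding state s1
# with probability 0.8; s1 is absorbing and yields reward 1 per step.
states = ["s0", "s1"]
actions = ["go", "stay"]
P = {("s0", "go"): [("s1", 0.8), ("s0", 0.2)],
     ("s0", "stay"): [("s0", 1.0)],
     ("s1", "go"): [("s1", 1.0)],
     ("s1", "stay"): [("s1", 1.0)]}
R = {("s0", "go"): 0.0, ("s0", "stay"): 0.0,
     ("s1", "go"): 1.0, ("s1", "stay"): 1.0}
V = value_iteration(states, actions, P, R)
```

In the paper's setting, agents would exchange such V(s) values for selected states rather than each computing the full joint model, which is what makes an approximation necessary for large systems.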
Pages: 1372+
Number of pages: 2