A Dynamic Programming Algorithm for Decentralized Markov Decision Processes with a Broadcast Structure

Cited by: 13
Authors
Wu, Jeff [1]
Lall, Sanjay [1]
Affiliations
[1] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA
Keywords
COMPLEXITY;
DOI
10.1109/CDC.2010.5718187
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
We give an optimal dynamic programming algorithm to solve a class of finite-horizon decentralized Markov decision processes (MDPs). We consider problems with a broadcast information structure that consists of a central node that only has access to its own state but can affect several outer nodes, while each outer node has access to both its own state and the central node's state, but cannot affect the other nodes. The solution to this problem involves a dynamic program similar to that of a centralized partially-observed Markov decision process.
Pages: 6143-6148
Number of pages: 6