A Dynamic Programming Algorithm for Decentralized Markov Decision Processes with a Broadcast Structure

Cited by: 13

Authors:

Wu, Jeff [1]

Lall, Sanjay [1]

Affiliation:

[1] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA

Source:

49TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC) | 2010

Keywords:

COMPLEXITY;

DOI:

10.1109/CDC.2010.5718187

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

We give an optimal dynamic programming algorithm to solve a class of finite-horizon decentralized Markov decision processes (MDPs). We consider problems with a broadcast information structure that consists of a central node that only has access to its own state but can affect several outer nodes, while each outer node has access to both its own state and the central node's state, but cannot affect the other nodes. The solution to this problem involves a dynamic program similar to that of a centralized partially-observed Markov decision process.

Pages: 6143-6148

Page count: 6
