Consolidated actor-critic model for partially-observable Markov decision processes

Times cited: 0

Authors:

Elhanany, I. [1]

Niedzwiedz, C. [1]

Liu, Z.

Livingston, S. [1]

Affiliations:

[1] Univ Tennessee, Dept Elect Engn & Comp Sci, Knoxville, TN 37996 USA

Source:

ELECTRONICS LETTERS | 2008 / Vol. 44 / No. 22

DOI:

10.1049/el:20081346

Chinese Library Classification (CLC):

TM [Electrical Technology]; TN [Electronic Technology, Communication Technology]

Subject classification codes:

0808; 0809

Abstract:

A method for consolidating the traditionally separate actor and critic neural networks in temporal difference learning for addressing partially-observable Markov decision processes (POMDPs) is presented. Simulation results for solving a five-state POMDP problem support the claim that the consolidated model achieves higher performance while reducing computational and storage requirements to approximately half those of the traditional approach.

Pages: 1317 / U41

Page count: 2
