Consolidated actor-critic model for partially-observable Markov decision processes

Cited by: 0
Authors
Elhanany, I. [1]
Niedzwiedz, C. [1]
Liu, Z.
Livingston, S. [1]
Affiliations
[1] Univ Tennessee, Dept Elect Engn & Comp Sci, Knoxville, TN 37996 USA
DOI
10.1049/el:20081346
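Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Subject classification codes
0808; 0809
Abstract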
A method for consolidating the traditionally separate actor and critic neural networks in temporal difference learning for addressing partially-observable Markov decision processes (POMDPs) is presented. Simulation results for solving a five-state POMDP problem support the claim that the consolidated model achieves higher performance while reducing computational and storage requirements to approximately half those of the traditional approach.
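To make the idea of consolidation concrete, here is a minimal sketch of one generic way to merge an actor and a critic: a single network with a shared hidden layer and two output heads. This is an illustration under stated assumptions and not necessarily the architecture of the paper. The class name `SharedActorCritic`, the layer sizes, the softmax policy head, and the discount factor are all hypothetical. The parameter counts it prints only show that sharing a hidden layer needs fewer weights than two separate networks; they do not reproduce the paper's reported savings.

```python
import math
import random


class SharedActorCritic:
    """Hypothetical single network: one shared hidden layer, two output heads."""

    def __init__(self, n_obs, n_hidden, n_actions, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.w_hidden = matrix(n_hidden, n_obs)
        self.b_hidden = [0.0] * n_hidden
        # Output row 0 is the value (critic) head; the remaining rows are
        # action preferences (actor head).
        self.w_out = matrix(1 + n_actions, n_hidden)
        self.b_out = [0.0] * (1 + n_actions)

    def forward(self, obs):
        """Return (value estimate, action probabilities) for one observation."""
        hidden = [math.tanh(sum(w * x for w, x in zip(row, obs)) + b)
                  for row, b in zip(self.w_hidden, self.b_hidden)]
        out = [sum(w * h for w, h in zip(row, hidden)) + b
               for row, b in zip(self.w_out, self.b_out)]
        value, prefs = out[0], out[1:]
        peak = max(prefs)  # subtract the max for a numerically stable softmax
        exps = [math.exp(p - peak) for p in prefs]
        total = sum(exps)
        return value, [e / total for e in exps]

    def param_count(self):
        return (sum(len(r) for r in self.w_hidden) + len(self.b_hidden)
                + sum(len(r) for r in self.w_out) + len(self.b_out))


def td_error(reward, value, next_value, gamma=0.95):
    """One-step temporal-difference error: r + gamma * V(next) - V(current)."""
    return reward + gamma * next_value - value


def separate_param_count(n_obs, n_hidden, n_actions):
    """Weights and biases of two independent one-hidden-layer networks."""
    actor = n_obs * n_hidden + n_hidden + n_hidden * n_actions + n_actions
    critic = n_obs * n_hidden + n_hidden + n_hidden + 1
    return actor + critic


# Assumed sizes: 5 observation inputs, 8 hidden units, 2 actions.
net = SharedActorCritic(n_obs=5, n_hidden=8, n_actions=2)
value, probs = net.forward([1.0, 0.0, 0.0, 0.0, 0.0])
print("shared network parameters:", net.param_count())
print("separate networks parameters:", separate_param_count(5, 8, 2))
```

In a design like this, the same TD error would drive updates to both heads, so the hidden layer is trained once instead of twice.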
Pages: 1317 / U41
Number of pages: 2
Related papers
50 items in total
  • [1] Qualitative Analysis of Partially-Observable Markov Decision Processes
    Chatterjee, Krishnendu
    Doyen, Laurent
    Henzinger, Thomas A.
    MATHEMATICAL FOUNDATIONS OF COMPUTER SCIENCE 2010, 2010, 6281 : 258 - 269
  • [2] An actor-critic algorithm for constrained Markov decision processes
    Borkar, VS
    SYSTEMS & CONTROL LETTERS, 2005, 54 (03) : 207 - 213
  • [3] Actor-critic algorithms for hierarchical Markov decision processes
    Bhatnagar, S
    Panigrahi, JR
    AUTOMATICA, 2006, 42 (04) : 637 - 644
  • [4] Actor-Critic Policy Optimization in Partially Observable Multiagent Environments
    Srinivasan, Sriram
    Lanctot, Marc
    Zambaldi, Vinicius
    Perolat, Julien
    Tuyls, Karl
    Munos, Remi
    Bowling, Michael
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [5] Guided Soft Actor Critic: A Guided Deep Reinforcement Learning Approach for Partially Observable Markov Decision Processes
    Haklidir, Mehmet
    Temeltas, Hakan
    IEEE ACCESS, 2021, 9 : 159672 - 159683
  • [6] Coordinating Eye-Hand Action via Partially-Observable Markov Decision Processes
    Cheng Yanyun
    Zhu Songhao
    Liang Zhiwei
    Fan Lili
    PROCEEDINGS OF THE 31ST CHINESE CONTROL CONFERENCE, 2012, : 3969 - 3973
  • [7] Mutual-Information Regularization in Markov Decision Processes and Actor-Critic Learning
    Leibfried, Felix
    Grau-Moya, Jordi
    CONFERENCE ON ROBOT LEARNING, VOL 100, 2019, 100
  • [8] An Online Actor-Critic Algorithm with Function Approximation for Constrained Markov Decision Processes
    Bhatnagar, Shalabh
    Lakshmanan, K.
    JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2012, 153 (03) : 688 - 708
  • [10] A partially-observable Markov decision process for dealing with dynamically changing environments
    Chatzis, Sotirios P.
    Kosmopoulos, Dimitrios
    IFIP Advances in Information and Communication Technology, 2014, 436 : 111 - 120