A Hessian Actor-Critic Algorithm

Times cited: 0
Authors
Wang, Jing [1]
Paschalidis, Ioannis Ch. [1, 2]
Affiliations
[1] Boston Univ, Div Syst Engn, 8 St Marys St, Boston, MA 02215 USA
[2] Boston Univ, Dept Elect & Comp Engn, Boston, MA 02215 USA
Keywords
Actor-critic algorithms; Newton's method; Markov decision processes; Autonomous robots; Sensitivity analysis; Potentials
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We consider Markov Decision Processes (MDPs) following a policy parameterized by a parsimonious set of parameters and seek to optimize the policy over these parameters. In this setting, optimization can be done using a gradient ascent method. If designed well, the parameterized policy can significantly reduce the problem complexity. Existing algorithms usually suffer from slow convergence because they search along the gradient direction in a steepest-ascent fashion. In this paper, we first propose an estimate for the Hessian of the overall reward the decision maker receives. Based on this estimate, we then introduce a new Newton-like method of the actor-critic type. We compare the new algorithm with several existing algorithms in a robotics application and demonstrate that our method exhibits faster convergence.
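The contrast the abstract draws between steepest ascent and a Newton-like update can be illustrated with a minimal sketch. This is not the paper's algorithm: the paper estimates the Hessian of the overall reward within an actor-critic scheme, whereas the sketch below assumes a hypothetical toy quadratic objective whose gradient and Hessian are known exactly. The names `A`, `b`, `gradient_ascent`, and `newton_ascent` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical toy objective standing in for the overall reward as a function
# of the policy parameters: J(theta) = -0.5 * theta^T A theta + b^T theta.
# A is ill-conditioned, which is where steepest ascent tends to be slow.
A = np.array([[10.0, 0.0], [0.0, 0.1]])
b = np.array([1.0, 1.0])


def grad(theta):
    """Exact gradient of the toy objective (an estimate in a real actor-critic)."""
    return b - A @ theta


def hessian(theta):
    """Exact Hessian of the toy objective (constant and negative definite)."""
    return -A


def gradient_ascent(theta, step=0.1, iters=100):
    """Plain steepest ascent with a fixed step size."""
    for _ in range(iters):
        theta = theta + step * grad(theta)
    return theta


def newton_ascent(theta, iters=1, damping=1e-8):
    """Newton-like ascent: rescale the gradient by the inverse Hessian.

    The small damping term keeps the linear solve well posed.
    """
    for _ in range(iters):
        H = hessian(theta) - damping * np.eye(len(theta))
        theta = theta - np.linalg.solve(H, grad(theta))
    return theta


theta_star = np.linalg.solve(A, b)  # maximizer of the toy objective
theta0 = np.zeros(2)
err_ga = np.linalg.norm(gradient_ascent(theta0) - theta_star)
err_newton = np.linalg.norm(newton_ascent(theta0) - theta_star)
print(f"gradient ascent error after 100 steps: {err_ga:.3f}")
print(f"Newton-like error after 1 step:        {err_newton:.2e}")
```

On this toy problem a single Newton-like step lands very close to the maximizer, while 100 fixed-step gradient updates remain far from it along the low-curvature direction. With noisy, simulation-based estimates of the gradient and Hessian, as in an actor-critic setting, the gap would depend on the quality of those estimates.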
Pages: 1131-1136
Number of pages: 6