A Hessian Actor-Critic Algorithm

Cited by: 0
Authors
Wang, Jing [1]
Paschalidis, Ioannis Ch [1, 2]
Affiliations
[1] Boston Univ, Div Syst Engn, 8 St Marys St, Boston, MA 02215 USA
[2] Boston Univ, Dept Elect & Comp Engn, Boston, MA 02215 USA
Keywords
Actor-critic algorithms; Newton's method; Markov decision processes; Autonomous robots; SENSITIVITY-ANALYSIS; POTENTIALS
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
We consider Markov Decision Processes (MDPs) following a policy parameterized by a parsimonious set of parameters and seek to optimize the policy over these parameters. In this setting, optimization can be done using a gradient ascent method. If designed well, the parameterized policy can significantly reduce the problem complexity. Existing algorithms usually suffer from slow convergence because they search along the gradient direction in steepest-ascent fashion. In this paper, we first propose an estimate for the Hessian of the overall reward the decision maker receives. Based on this estimate, we then introduce a new Newton-like method of the actor-critic type. We compare the new algorithm with several existing algorithms in a robotics application and demonstrate that our method exhibits faster convergence.
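The contrast the abstract draws between steepest ascent and a Newton-like update can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it assumes a hypothetical two-parameter concave quadratic reward with exactly known gradient and Hessian, whereas an actor-critic method would estimate both from sampled trajectories. All names and constants (`A`, `OPT`, the step size) are invented for illustration.

```python
# Toy comparison on a concave quadratic "reward" J(theta); NOT the paper's algorithm.
# J(theta) = -0.5 * (a1*(t1 - 1)^2 + a2*(t2 + 2)^2), maximized at (1, -2).
A = (100.0, 1.0)   # curvatures (hypothetical, ill-conditioned on purpose)
OPT = (1.0, -2.0)  # maximizer of J

def grad(theta):
    # Exact gradient of J; an actor-critic method would estimate this from samples.
    return tuple(-a * (t - o) for a, t, o in zip(A, theta, OPT))

def hessian_diag(theta):
    # Exact Hessian of J (diagonal, constant, negative definite for this toy J).
    return tuple(-a for a in A)

def gradient_ascent_step(theta, step=0.01):
    # Steepest ascent: the step size is limited by the largest curvature.
    return tuple(t + step * g for t, g in zip(theta, grad(theta)))

def newton_step(theta):
    # theta - H^{-1} g; for a diagonal H this is elementwise division.
    return tuple(t - g / h
                 for t, g, h in zip(theta, grad(theta), hessian_diag(theta)))

def distance(theta):
    # Largest coordinate-wise distance to the maximizer.
    return max(abs(t - o) for t, o in zip(theta, OPT))

theta_ga = theta_nt = (0.0, 0.0)
for _ in range(50):
    theta_ga = gradient_ascent_step(theta_ga)
theta_nt = newton_step(theta_nt)

print("gradient ascent after 50 steps:", distance(theta_ga))
print("Newton after 1 step:", distance(theta_nt))
```

On this toy objective the Newton step rescales each direction by its curvature and so reaches the maximizer in a single update, while gradient ascent is still far away in the low-curvature direction after 50 steps. With noisy, sample-based Hessian estimates, as in the setting the abstract describes, no such one-step guarantee should be expected.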
Pages: 1131-1136
Page count: 6