A Hessian Actor-Critic Algorithm

Cited by: 0
Authors
Wang, Jing [1]
Paschalidis, Ioannis Ch. [1, 2]
Affiliations
[1] Boston Univ, Div Syst Engn, 8 St Marys St, Boston, MA 02215 USA
[2] Boston Univ, Dept Elect & Comp Engn, Boston, MA 02215 USA
Keywords
Actor-critic algorithms; Newton's method; Markov decision processes; Autonomous robots; Sensitivity analysis; Potentials
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We consider Markov Decision Processes (MDPs) following a policy parameterized by a parsimonious set of parameters and seek to optimize the policy over these parameters. In this setting, optimization can be done using a gradient ascent method. If designed well, the parameterized policy can significantly reduce the problem complexity. Existing algorithms usually suffer from slow convergence because they search along the gradient direction in a steepest-ascent manner. In this paper, we first propose an estimate for the Hessian of the overall reward the decision maker receives. Based on this estimate, we then introduce a new Newton-like method of the actor-critic type. We compare the new algorithm with several existing algorithms in a robotics application and demonstrate that our method exhibits faster convergence.
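The contrast the abstract draws between steepest ascent and a Newton-like step can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it uses an exact gradient and Hessian of a made-up concave quadratic, whereas the paper estimates the Hessian of the reward within an actor-critic scheme. The matrix `A`, the vector `b`, the step size, and all function names are assumptions chosen only for illustration.

```python
import numpy as np

# Toy concave objective J(theta) = -0.5 * theta^T A theta + b^T theta.
# A and b are made-up illustration values, not taken from the paper.
A = np.array([[100.0, 0.0], [0.0, 1.0]])  # ill-conditioned curvature
b = np.array([1.0, 1.0])


def gradient(theta):
    """Exact gradient of J at theta."""
    return b - A @ theta


def hessian(theta):
    """Exact Hessian of J; constant because J is quadratic."""
    return -A


def steepest_ascent(theta, step=0.015, tol=1e-8, max_iter=10_000):
    """Fixed-step gradient ascent; returns the iterate and the update count."""
    for k in range(max_iter):
        g = gradient(theta)
        if np.linalg.norm(g) < tol:
            return theta, k
        theta = theta + step * g
    return theta, max_iter


def newton_ascent(theta, tol=1e-8, max_iter=100):
    """Newton iteration theta <- theta - H^{-1} g; returns iterate and update count."""
    for k in range(max_iter):
        g = gradient(theta)
        if np.linalg.norm(g) < tol:
            return theta, k
        theta = theta - np.linalg.solve(hessian(theta), g)
    return theta, max_iter


theta0 = np.zeros(2)
theta_gd, iters_gd = steepest_ascent(theta0)
theta_nt, iters_nt = newton_ascent(theta0)
theta_star = np.linalg.solve(A, b)  # closed-form maximizer for comparison

print("steepest ascent updates:", iters_gd)
print("Newton updates:", iters_nt)
```

On this quadratic the Newton step rescales the gradient by the inverse curvature, so it needs far fewer updates than fixed-step ascent, whose step size is limited by the largest curvature. With noisy Hessian estimates, as in the setting the abstract describes, such gains are not guaranteed by this sketch.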
Pages: 1131-1136
Page count: 6
Related papers
50 in total
  • [1] An Actor-Critic Algorithm With Second-Order Actor and Critic
    Wang, Jing
    Paschalidis, Ioannis Ch.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2017, 62 (06) : 2689 - 2703
  • [2] An Actor-Critic Algorithm for SVM Hyperparameters
    Kim, Chayoung
    Park, Jung-min
    Kim, Hye-young
    INFORMATION SCIENCE AND APPLICATIONS 2018, ICISA 2018, 2019, 514 : 653 - 661
  • [3] A Finite Sample Analysis of the Actor-Critic Algorithm
    Yang, Zhuoran
    Zhang, Kaiqing
    Hong, Mingyi
    Basar, Tamer
    2018 IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2018, : 2759 - 2764
  • [4] Actor-Critic Algorithm with Transition Cost Estimation
    Sergey, Denisov
    Lee, Jee-Hyong
    INTERNATIONAL JOURNAL OF FUZZY LOGIC AND INTELLIGENT SYSTEMS, 2016, 16 (04) : 270 - 275
  • [5] The Effect of Discounting Actor-loss in Actor-Critic Algorithm
    Yaputra, Jordi
    Suyanto, Suyanto
    2021 4TH INTERNATIONAL SEMINAR ON RESEARCH OF INFORMATION TECHNOLOGY AND INTELLIGENT SYSTEMS (ISRITI 2021), 2020
  • [6] A Soft Actor-Critic Algorithm for Sequential Recommendation
    Hong, Hyejin
    Kimura, Yusuke
    Hatano, Kenji
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, PT I, DEXA 2024, 2024, 14910 : 258 - 266
  • [7] A modified actor-critic reinforcement learning algorithm
    Mustapha, SM
    Lachiver, G
    2000 CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, CONFERENCE PROCEEDINGS, VOLS 1 AND 2: NAVIGATING TO A NEW ERA, 2000, : 605 - 609
  • [8] SOFT ACTOR-CRITIC ALGORITHM WITH ADAPTIVE NORMALIZATION
    Gao, Xiaonan
    Wu, Ziyi
    Zhu, Xianchao
    Cai, Lei
    JOURNAL OF NONLINEAR FUNCTIONAL ANALYSIS, 2025, 2025
  • [9] Actor-critic algorithms
    Konda, VR
    Tsitsiklis, JN
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 12, 2000, 12 : 1008 - 1014
  • [10] On actor-critic algorithms
    Konda, VR
    Tsitsiklis, JN
    SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2003, 42 (04) : 1143 - 1166