A Hessian Actor-Critic Algorithm

Citations: 0

Authors:

Wang, Jing [1]

Paschalidis, Ioannis Ch [1,2]

Affiliations:

[1] Boston Univ, Div Syst Engn, 8 St Marys St, Boston, MA 02215 USA

[2] Boston Univ, Dept Elect & Comp Engn, Boston, MA 02215 USA

Source:

2014 IEEE 53rd Annual Conference on Decision and Control (CDC) | 2014

Keywords:

Actor-critic algorithms; Newton's method; Markov decision processes; Autonomous robots; Sensitivity analysis; Potentials

DOI:

Not available

Chinese Library Classification number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

We consider Markov Decision Processes (MDPs) following a policy parametrized by a parsimonious set of parameters and seek to optimize the policy over these parameters. In this setting, optimization can be done using a gradient ascent method. If designed well, the parameterized policy can significantly reduce the problem complexity. Existing algorithms usually suffer from slow convergence because they search along the gradient direction in a steepest ascent way. In this paper, we first propose an estimate for the Hessian of the overall reward the decision maker receives. Based on this estimate, we then introduce a new Newton-like method of the actor-critic type. We compare the new algorithm with several existing algorithms in a robotics application and demonstrate that our method exhibits faster convergence.
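The contrast the abstract draws between steepest ascent and a Newton-like update can be illustrated with a toy example. The sketch below is not the paper's algorithm: the paper estimates the Hessian of the overall reward within an actor-critic scheme, whereas this example assumes a hand-picked, ill-conditioned concave quadratic whose gradient and Hessian are known exactly. All names (`A_DIAG`, `B`, `gradient_ascent_step`, `newton_step`, `run`) are hypothetical and chosen only for illustration.

```python
# Toy objective (an assumption, not from the paper):
#   J(theta) = sum_i (B[i] * theta[i] - 0.5 * A_DIAG[i] * theta[i]**2)
# It is concave with maximizer theta* = (1, 1) and Hessian -diag(A_DIAG).
A_DIAG = [1.0, 100.0]
B = [1.0, 100.0]


def gradient(theta):
    """Exact gradient of J at theta."""
    return [b - a * t for a, b, t in zip(A_DIAG, B, theta)]


def gradient_ascent_step(theta, step_size=0.01):
    """Steepest ascent: move along the gradient with a fixed step size."""
    g = gradient(theta)
    return [t + step_size * gi for t, gi in zip(theta, g)]


def newton_step(theta):
    """Newton update theta - H^{-1} g, with H = -diag(A_DIAG)."""
    g = gradient(theta)
    return [t + gi / a for t, gi, a in zip(theta, g, A_DIAG)]


def distance_to_optimum(theta):
    """Largest coordinate-wise distance to the maximizer (1, 1)."""
    return max(abs(t - 1.0) for t in theta)


def run(step_fn, n_steps):
    """Apply step_fn n_steps times starting from the origin."""
    theta = [0.0, 0.0]
    for _ in range(n_steps):
        theta = step_fn(theta)
    return theta


if __name__ == "__main__":
    print("gradient ascent, 100 steps:", distance_to_optimum(run(gradient_ascent_step, 100)))
    print("Newton, 1 step:", distance_to_optimum(run(newton_step, 1)))
```

On this quadratic the Newton step rescales the gradient by the curvature and reaches the maximizer in a single update, while the fixed-step gradient method is limited by the stiffest direction and is still far from the optimum after 100 steps. With a sampled, noisy Hessian estimate, as in an actor-critic setting, the gain would generally be less clear-cut than in this idealized case.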

Pages: 1131-1136

Page count: 6
