A Hessian Actor-Critic Algorithm

Cited by: 0

Authors:

Wang, Jing [1]

Paschalidis, Ioannis Ch [1,2]

Affiliations:

[1] Boston Univ, Div Syst Engn, 8 St Marys St, Boston, MA 02215 USA

[2] Boston Univ, Dept Elect & Comp Engn, Boston, MA 02215 USA

Source:

2014 IEEE 53RD ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC) | 2014

Keywords:

Actor-critic algorithms; Newton's method; Markov decision processes; Autonomous robots; SENSITIVITY-ANALYSIS; POTENTIALS;

DOI:

Not available

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

We consider Markov Decision Processes (MDPs) following a policy parametrized by a parsimonious set of parameters and seek to optimize the policy over these parameters. In this setting, optimization can be done using a gradient ascent method. If designed well, the parameterized policy can significantly reduce the problem complexity. Existing algorithms usually suffer from slow convergence because they search along the gradient direction in a steepest ascent way. In this paper, we first propose an estimate for the Hessian of the overall reward the decision maker receives. Based on this estimate, we then introduce a new Newton-like method of the actor-critic type. We compare the new algorithm with several existing algorithms in a robotics application and demonstrate that our method exhibits faster convergence.

Pages: 1131 - 1136

Page count: 6
