A Hessian Actor-Critic Algorithm

Authors
Wang, Jing [1]
Paschalidis, Ioannis Ch [1, 2]
Affiliations
[1] Boston Univ, Div Syst Engn, 8 St Marys St, Boston, MA 02215 USA
[2] Boston Univ, Dept Elect & Comp Engn, Boston, MA 02215 USA
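Keywords
Actor-critic algorithms; Newton's method; Markov decision processes; Autonomous robots; Sensitivity analysis; Potentials
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract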
We consider Markov Decision Processes (MDPs) following a policy parameterized by a parsimonious set of parameters and seek to optimize the policy over these parameters. In this setting, optimization can be done using a gradient ascent method. If designed well, the parameterized policy can significantly reduce the problem complexity. Existing algorithms usually suffer from slow convergence because they search along the gradient direction in a steepest-ascent manner. In this paper, we first propose an estimate for the Hessian of the overall reward the decision maker receives. Based on this estimate, we then introduce a new Newton-like method of the actor-critic type. We compare the new algorithm with several existing algorithms in a robotics application and demonstrate that our method exhibits faster convergence.
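The contrast the abstract draws between steepest ascent and a Newton-like update can be illustrated with a small deterministic sketch. This is not the paper's algorithm: the actor-critic method estimates the gradient and Hessian of the overall reward from samples, whereas the toy below assumes both are known exactly for a hypothetical concave quadratic objective. The matrix `A`, the vector `B`, the step size, and all function names are assumptions introduced only for illustration.

```python
# Toy illustration only: a known concave quadratic stands in for the
# policy's overall reward J(theta). The gradient and Hessian are exact here,
# which is an assumption; the paper estimates them from samples.

A = [[10.0, 0.0], [0.0, 0.1]]  # hypothetical curvature, poorly conditioned
B = [1.0, 1.0]                 # hypothetical linear term


def gradient(theta):
    """Gradient of J(theta) = -0.5 * theta^T A theta + B^T theta."""
    return [B[i] - sum(A[i][j] * theta[j] for j in range(2)) for i in range(2)]


def hessian(theta):
    """Hessian of J; constant and negative definite for this toy."""
    return [[-A[i][j] for j in range(2)] for i in range(2)]


def solve_2x2(m, v):
    """Solve m x = v for a 2x2 matrix m using Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(v[0] * m[1][1] - m[0][1] * v[1]) / det,
            (m[0][0] * v[1] - v[0] * m[1][0]) / det]


def gradient_ascent_step(theta, step=0.1):
    """Steepest ascent: move along the gradient with a fixed step size."""
    g = gradient(theta)
    return [theta[i] + step * g[i] for i in range(2)]


def newton_step(theta):
    """Newton update: theta - H^{-1} g, rescaling the gradient by curvature."""
    d = solve_2x2(hessian(theta), gradient(theta))
    return [theta[i] - d[i] for i in range(2)]


def iterations_to_converge(step_fn, tol=1e-6, max_iter=100000):
    """Count updates until every gradient component is below tol."""
    theta = [0.0, 0.0]
    for k in range(max_iter):
        if max(abs(x) for x in gradient(theta)) < tol:
            return k, theta
        theta = step_fn(theta)
    return max_iter, theta


ga_iters, ga_theta = iterations_to_converge(gradient_ascent_step)
newton_iters, newton_theta = iterations_to_converge(newton_step)
print("gradient ascent iterations:", ga_iters)
print("Newton iterations:", newton_iters)
```

Because the toy objective is exactly quadratic, the Newton update reaches the maximizer in a single step, while fixed-step gradient ascent is slowed by the weakly curved direction. With sampled, noisy estimates, as in an actor-critic setting, the gap would not be this clean.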
Pages: 1131-1136
Page count: 6