On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method

Cited by: 0
Authors
Zhang, Junyu [1 ]
Ni, Chengzhuo [2 ]
Yu, Zheng [2 ]
Szepesvari, Csaba [3 ]
Wang, Mengdi [2 ]
Affiliations
[1] Natl Univ Singapore, Dept Ind Syst Engn & Management, Singapore 119077, Singapore
[2] Princeton Univ, Dept Elect & Comp Engn, Princeton, NJ 08544 USA
[3] Univ Alberta, Dept Comp Sci, Edmonton, AB T6G 2E8, Canada
Keywords
Algorithms
DOI
none available
CLC classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Policy gradient (PG) methods give rise to a rich class of reinforcement learning (RL) algorithms. Recently, there has been an emerging trend of accelerating existing PG methods, such as REINFORCE, with variance reduction techniques. However, all existing variance-reduced PG methods rely heavily on an uncheckable importance weight assumption made for every single iteration of the algorithms. In this paper, a simple gradient truncation mechanism is proposed to address this issue. Moreover, we design a Truncated Stochastic Incremental Variance-Reduced Policy Gradient (TSIVR-PG) method, which can maximize not only a cumulative sum of rewards but also a general utility function of a policy's long-term visiting distribution. We show an Õ(ε⁻³) sample complexity for TSIVR-PG to find an ε-stationary policy. By assuming overparameterization of the policy and exploiting the hidden convexity of the problem, we further show that TSIVR-PG converges to a globally ε-optimal policy with Õ(ε⁻²) samples.
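The gradient truncation idea in the abstract can be illustrated with a minimal sketch. This is not the paper's TSIVR-PG algorithm; the names (`truncated_weight`, `vr_gradient`) and the clipping threshold `w_max` are hypothetical. The sketch shows the two ingredients the abstract mentions: a trajectory importance weight clipped to a bounded range (so no uncheckable bounded-weight assumption is needed), and a recursive variance-reduced gradient estimator of the SARAH/SPIDER type into which the clipped weight enters.

```python
import math

def truncated_weight(logp_new, logp_old, w_max=10.0):
    """Importance weight of a trajectory under the new vs. old policy,
    clipped to [0, w_max] so its variance stays bounded."""
    return min(math.exp(logp_new - logp_old), w_max)

def vr_gradient(grad_new, grad_old, v_prev, w):
    """Recursive variance-reduced estimator (SARAH/SPIDER-style):
    v_t = g(theta_t) - w * g(theta_{t-1}) + v_{t-1},
    where w is the (truncated) importance weight correcting for the
    fact that the trajectory was sampled under the previous policy."""
    return [gn - w * go + vp for gn, go, vp in zip(grad_new, grad_old, v_prev)]

# Example: a weight that would explode (exp(5) ~ 148) is capped at w_max.
w = truncated_weight(logp_new=0.0, logp_old=-5.0, w_max=10.0)
v = vr_gradient(grad_new=[1.0, 2.0], grad_old=[1.0, 2.0], v_prev=[0.5, 0.5], w=1.0)
```

When the two gradients coincide and w = 1, the estimator simply carries the previous estimate forward, which is what keeps its incremental variance small between reference updates.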
Pages: 13
Related Papers (50 total)
  • [41] Riemannian Stochastic Variance-Reduced Cubic Regularized Newton Method for Submanifold Optimization
    Zhang, Dewei
    Tajbakhsh, Sam Davanloo
    JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2023, 196 (01) : 324 - 361
  • [43] Stochastic Variance-Reduced Cubic Regularization for Nonconvex Optimization
    Wang, Zhe
    Zhou, Yi
    Liang, Yingbin
    Lan, Guanghui
    22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89, 2019, 89
  • [44] MURANA: A Generic Framework for Stochastic Variance-Reduced Optimization
    Condat, Laurent
    Richtarik, Peter
    MATHEMATICAL AND SCIENTIFIC MACHINE LEARNING, VOL 190, 2022, 190
  • [45] Improved Sample Complexity for Stochastic Compositional Variance Reduced Gradient
    Lin, Tianyi
    Fan, Chengyou
    Wang, Mengdi
    Jordan, Michael I.
    2020 AMERICAN CONTROL CONFERENCE (ACC), 2020, : 126 - 131
  • [46] Stochastic Variance-Reduced Cubic Regularized Newton Methods
    Zhou, Dongruo
    Xu, Pan
    Gu, Quanquan
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [47] Variance-reduced particle methods for solving the Boltzmann equation
    Baker, Lowell L.
    Hadjiconstantinou, Nicolas G.
    JOURNAL OF COMPUTATIONAL AND THEORETICAL NANOSCIENCE, 2008, 5 (02) : 165 - 174
  • [48] Stochastic Recursive Variance-Reduced Cubic Regularization Methods
    Zhou, Dongruo
    Gu, Quanquan
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 3980 - 3989
  • [49] Momentum-based variance-reduced stochastic Bregman proximal gradient methods for nonconvex nonsmooth optimization
    Liao, Shichen
    Liu, Yan
    Han, Congying
    Guo, Tiande
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 266
  • [50] VARIANCE-REDUCED SIMULATION OF MULTISCALE TUMOR GROWTH MODELING
    Lejon, Annelies
    Mortier, Bert
    Samaey, Giovanni
    MULTISCALE MODELING & SIMULATION, 2017, 15 (01): : 388 - 409