Finite-Sample Bounds for Adaptive Inverse Reinforcement Learning using Passive Langevin Dynamics

Cited by: 0
Authors
Snow, Luke [1 ]
Krishnamurthy, Vikram [1 ]
Affiliations
[1] Cornell Univ, Sch Elect & Comp Engn, Ithaca, NY 14853 USA
Funding
U.S. National Science Foundation
Keywords
ALGORITHMS
DOI
10.1109/CDC49753.2023.10383223
Chinese Library Classification
TP [automation technology, computer technology]
Discipline Code
0812
Abstract
Stochastic gradient Langevin dynamics (SGLD) is a useful methodology for sampling from probability distributions. This paper provides a finite-sample analysis of a passive stochastic gradient Langevin dynamics (PSGLD) algorithm designed to achieve inverse reinforcement learning. By "passive," we mean that the noisy gradients available to the PSGLD algorithm (the inverse learning process) are evaluated at points chosen randomly by an external stochastic gradient algorithm (the forward learner). The PSGLD algorithm acts as a randomized sampler that recovers the cost function being optimized by this external process. Previous work analyzed the asymptotic performance of this passive algorithm using stochastic approximation techniques; here we analyze its non-asymptotic performance. Specifically, we provide finite-time bounds on the 2-Wasserstein distance between the law of the passive algorithm and its stationary measure, from which the reconstructed cost function is obtained.
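The forward/passive setup described in the abstract can be sketched in a toy one-dimensional example: a forward learner runs SGD on a cost it alone knows, while a passive learner only observes the forward learner's query points and noisy gradients, and runs a kernel-weighted Langevin recursion whose samples concentrate near the cost's minimizer. Everything below (the quadratic cost, the Gaussian kernel, and all hyperparameter values) is an illustrative assumption, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy cost the forward learner minimizes (unknown to the passive learner):
# J(x) = 0.5 * (x - 1)^2, with minimizer at x = 1.
def grad_J(x):
    return x - 1.0

# --- Forward learner: plain SGD with noisy gradients --------------------
mu, sigma = 0.2, 1.0          # step size and gradient-noise level (assumed)
n_steps = 20_000
x = 3.0
xs, noisy_grads = [], []
for _ in range(n_steps):
    g = grad_J(x) + sigma * rng.standard_normal()
    xs.append(x)
    noisy_grads.append(g)     # the pair (x_k, g_k) is all the passive side sees
    x = x - mu * g

# --- Passive learner: kernel-weighted Langevin sampler ------------------
# Illustrative hyperparameters: step eps, kernel width delta, inverse
# temperature beta.
eps, delta, beta = 0.01, 1.0, 500.0
theta = 0.0
thetas = []
for xk, gk in zip(xs, noisy_grads):
    # Gaussian kernel: trust the observed gradient only when the query
    # point x_k lies near the current sample theta.
    k = np.exp(-0.5 * ((xk - theta) / delta) ** 2)
    theta = (theta
             - eps * (k / delta) * gk                              # kernel-weighted drift
             + np.sqrt(2.0 * eps / beta) * rng.standard_normal())  # diffusion term
    thetas.append(theta)

print("forward iterate mean:", np.mean(xs[-1000:]))
print("passive sample mean :", np.mean(thetas[-5000:]))
```

In this sketch the passive iterates drift toward the minimizer at 1 even though the passive learner never evaluates the cost or its gradient itself; the narrow kernel is what turns the externally chosen query points into a usable drift, which is the mechanism the PSGLD analysis makes quantitative.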
Pages: 3618-3625 (8 pages)