Model-Free Imitation Learning with Policy Optimization

Cited by: 0
Authors
Ho, Jonathan [1]
Gupta, Jayesh K. [1]
Ermon, Stefano [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
Funding
U.S. National Science Foundation
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In imitation learning, an agent learns how to behave in an environment with an unknown cost function by mimicking expert demonstrations. Existing imitation learning algorithms typically involve solving a sequence of planning or reinforcement learning problems. Such algorithms are therefore not directly applicable to large, high-dimensional environments, and their performance can significantly degrade if the planning problems are not solved to optimality. Under the apprenticeship learning formalism, we develop alternative model-free algorithms for finding a parameterized stochastic policy that performs at least as well as an expert policy on an unknown cost function, based on sample trajectories from the expert. Our approach, based on policy gradients, scales to large continuous environments with guaranteed convergence to local minima.
Pages: 10
Related papers
50 in total
  • [41] Model-Free Quantum Control with Reinforcement Learning
    Sivak, V. V.
    Eickbusch, A.
    Liu, H.
    Royer, B.
    Tsioutsios, I
    Devoret, M. H.
    PHYSICAL REVIEW X, 2022, 12 (01)
  • [42] Model-Free Guidance Method for Drones in Complex Environments Using Direct Policy Exploration and Optimization
    Liu, Hongxun
    Suzuki, Satoshi
    DRONES, 2023, 7 (08)
  • [43] Model-free learning control for unstable system
    Ribeiro, CHC
    Hemerly, EM
    ELECTRONICS LETTERS, 1998, 34 (21) : 2070 - 2071
  • [44] Model-Free Reinforcement Learning Algorithms: A Survey
    Calisir, Sinan
    Pehlivanoglu, Meltem Kurt
    2019 27TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2019
  • [45] Recovering Robustness in Model-Free Reinforcement Learning
    Venkataraman, Harish K.
    Seiler, Peter J.
    2019 AMERICAN CONTROL CONFERENCE (ACC), 2019, : 4210 - 4216
  • [46] Online Nonstochastic Model-Free Reinforcement Learning
    Ghai, Udaya
    Gupta, Arushi
    Xia, Wenhan
    Singh, Karan
    Hazan, Elad
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [47] Model-free learning of wire winding control
    Rodriguez, Abdel
    Vrancx, Peter
    Nowe, Ann
    Hostens, Erik
    2013 9TH ASIAN CONTROL CONFERENCE (ASCC), 2013
  • [48] Model-Free Approximate Bayesian Learning for Large-Scale Conversion Funnel Optimization
    Iyengar, Garud
    Singal, Raghav
    PRODUCTION AND OPERATIONS MANAGEMENT, 2024, 33 (03) : 775 - 794
  • [49] Model-Free Learning and Optimal Policy Design in Multiagent MDPs Under Probabilistic Agent Dropout
    Fiscko, Carmel
    Kar, Soummya
    Sinopoli, Bruno
    IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2025, 12 (01) : 361 - 373
  • [50] Learn Zero-Constraint-Violation Safe Policy in Model-Free Constrained Reinforcement Learning
    Ma, Haitong
    Liu, Changliu
    Li, Shengbo Eben
    Zheng, Sifa
    Sun, Wenchao
    Chen, Jianyu
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025, 36 (02) : 2327 - 2341