Model-Free Imitation Learning with Policy Optimization

Cited by: 0
Authors
Ho, Jonathan [1]
Gupta, Jayesh K. [1]
Ermon, Stefano [1]
Affiliation
[1] Stanford Univ, Stanford, CA 94305 USA
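Funding
National Science Foundation (USA);
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract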
In imitation learning, an agent learns how to behave in an environment with an unknown cost function by mimicking expert demonstrations. Existing imitation learning algorithms typically involve solving a sequence of planning or reinforcement learning problems. Such algorithms are therefore not directly applicable to large, high-dimensional environments, and their performance can significantly degrade if the planning problems are not solved to optimality. Under the apprenticeship learning formalism, we develop alternative model-free algorithms for finding a parameterized stochastic policy that performs at least as well as an expert policy on an unknown cost function, based on sample trajectories from the expert. Our approach, based on policy gradients, scales to large continuous environments with guaranteed convergence to local minima.
Pages: 10
Related papers
50 in total
  • [31] Model-free Control Design Using Policy Gradient Reinforcement Learning in LPV Framework
    Bao, Yajie
    Velni, Javad Mohammadpour
    2021 EUROPEAN CONTROL CONFERENCE (ECC), 2021, : 150 - 155
  • [32] Learning Representations in Model-Free Hierarchical Reinforcement Learning
    Rafati, Jacob
    Noelle, David C.
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 10009 - 10010
  • [33] Model-Free GPU Online Energy Optimization
    Wang, Farui
    Hao, Meng
    Zhang, Weizhe
    Wang, Zheng
    IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING, 2024, 9 (02): : 141 - 154
  • [34] Optimizing a Continuum Manipulator's Search Policy Through Model-Free Reinforcement Learning
    Frazelle, Chase
    Rogers, Jonathan
    Karamouzas, Ioannis
    Walker, Ian
    2020 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2020, : 5564 - 5571
  • [35] Locally Weighted Least Squares Policy Iteration for Model-free Learning in Uncertain Environments
    Howard, Matthew
    Nakamura, Yoshihiko
    2013 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2013, : 1223 - 1229
  • [36] Prosocial learning: Model-based or model-free?
    Navidi, Parisa
    Saeedpour, Sepehr
    Ershadmanesh, Sara
    Hossein, Mostafa Miandari
    Bahrami, Bahador
    PLOS ONE, 2023, 18 (06):
  • [37] Model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization
    Bogdanovic, Miroslav
    Khadiv, Majid
    Righetti, Ludovic
    FRONTIERS IN ROBOTICS AND AI, 2022, 9
  • [38] MFRLMO: Model-free reinforcement learning for multi-objective optimization of Apache Spark
    Ozturk, Muhammed Maruf
    EAI ENDORSED TRANSACTIONS ON SCALABLE INFORMATION SYSTEMS, 2024, 11 (05): : 1 - 15
  • [39] Model-free learning adaptive controller with neural network compensator and differential evolution optimization
    dos Santos Coelho, Leandro
    Rodrigues Coelho, Antonio Augusto
    Sumar, Rodrigo R.
    PROCEEDINGS OF THE 2006 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT CONTROL, 2006, : 388 - +
  • [40] Model-Free Active Exploration in Reinforcement Learning
    Russo, Alessio
    Proutiere, Alexandre
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,