Learning From Demonstrations: A Computationally Efficient Inverse Reinforcement Learning Approach With Simplified Implementation

Times Cited: 0
Authors
Lin, Yanbin [1 ]
Ni, Zhen [1 ]
Zhong, Xiangnan [1 ]
Affiliations
[1] Florida Atlantic Univ, Dept Elect Engn & Comp Sci, Boca Raton, FL 33431 USA
Funding
U.S. National Science Foundation
Keywords
Trajectory; Reinforcement learning; Training; Optimization; Heuristic algorithms; Approximation algorithms; Markov decision processes; Iterative methods; Imitation learning; Computational modeling; Actor critic methods; inverse reinforcement learning; neural networks; reward recovering; featurization network; online optimization
DOI
10.1109/TETCI.2025.3526502
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Reinforcement learning (RL) typically requires a reward function crafted from sophisticated domain knowledge to perform well. Inverse reinforcement learning (IRL) methods offer a way to recover such reward functions from existing expert demonstrations. However, current IRL methods demand intensive computation and excessive memory as the state space grows. To this end, we propose a computationally efficient inverse reinforcement learning (e-IRL) approach that 1) simplifies the gradient algorithm for the reward network; 2) implements the loss function with feature expectations instead of state visitation frequencies; and 3) enables fast, accurate, and automatic feature processing. Specifically, we design a new featurization network that accommodates trajectories and automatically outputs aligned feature vectors without human intervention. In addition, the proposed approach derives simplified formulas for the gradient of the loss function used to update the reward-network weights, eliminating the repeated computation and excessive storage of gradient parameters over demonstrated trajectories required by existing methods. Feature expectations are computed once for the expert (one-shot) and in a streamlined fashion for the learner, efficiently yielding the required gradient parameters. Three examples validate the effectiveness of the proposed method, which outperforms comparable methods in average return, average number of steps required, and number of demonstrations required.
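The abstract's core computational idea, matching discounted feature expectations between expert and learner rather than tracking state visitation frequencies, can be illustrated with a short sketch. This is a minimal illustration assuming a linear reward r(s) = w·φ(s) and a generic feature-matching gradient; the function names, the toy featurizer, and the learning rate are assumptions for exposition, not the paper's actual e-IRL formulas or its featurization network.

    import numpy as np

    def feature_expectation(trajectories, featurize, gamma=0.99):
        # Empirical discounted feature expectation:
        # mu = (1/N) * sum_i sum_t gamma^t * phi(s_{i,t})
        mu = None
        for traj in trajectories:
            for t, state in enumerate(traj):
                phi = (gamma ** t) * featurize(state)
                mu = phi if mu is None else mu + phi
        return mu / len(trajectories)

    def reward_gradient_step(w, mu_expert, mu_learner, lr=0.1):
        # For a linear reward r(s) = w @ phi(s), a feature-matching loss
        # L(w) = w @ (mu_learner - mu_expert) has gradient (mu_learner - mu_expert);
        # stepping against it reshapes the reward to favor expert-like features.
        return w - lr * (mu_learner - mu_expert)

    # Toy usage with a hypothetical 3-dimensional featurizer.
    featurize = lambda s: np.asarray(s, dtype=float)
    expert_trajs  = [[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]]
    learner_trajs = [[(0.0, 0.0, 1.0), (0.0, 1.0, 0.0)]]
    mu_E = feature_expectation(expert_trajs, featurize)   # expert: computed once ("one-shot")
    mu_L = feature_expectation(learner_trajs, featurize)  # learner: refreshed each iteration
    w = reward_gradient_step(np.zeros(3), mu_E, mu_L)

Because the demonstrations are fixed, mu_E needs to be computed only once and cached, which corresponds to the "one-shot" expert computation the abstract mentions; only the learner side must be recomputed from fresh rollouts as the policy improves.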
Pages: 13
Related Papers (50 records in total)
  • [1] An Efficient Unified Approach Using Demonstrations for Inverse Reinforcement Learning
    Hwang, Maxwell
    Jiang, Wei-Cheng
    Chen, Yu-Jen
    Hwang, Kao-Shing
    Tseng, Yi-Chia
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2021, 13 (03) : 444 - 452
  • [2] An Unified Approach to Inverse Reinforcement Learning by Oppositive Demonstrations
    Hwang, Kao-Shing
    Jiang, Wei-Cheng
    Tseng, Yi-Chia
    PROCEEDINGS 2016 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY (ICIT), 2016, : 1664 - 1668
  • [3] Learning Fairness from Demonstrations via Inverse Reinforcement Learning
    Blandin, Jack
    Kash, Ian
    PROCEEDINGS OF THE 2024 ACM CONFERENCE ON FAIRNESS, ACCOUNTABILITY, AND TRANSPARENCY, ACM FACCT 2024, 2024, : 51 - 61
  • [4] Inverse Reinforcement Learning of Interaction Dynamics from Demonstrations
    Hussein, Mostafa
    Begum, Momotaz
    Petrik, Marek
    2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, : 2267 - 2274
  • [5] Sample Efficient Reinforcement Learning through Learning from Demonstrations in Minecraft
    Scheller, Christian
    Schraner, Yanick
    Vogel, Manfred
    NEURIPS 2019 COMPETITION AND DEMONSTRATION TRACK, VOL 123, 2019, 123 : 67 - 76
  • [6] Fast Lifelong Adaptive Inverse Reinforcement Learning from Demonstrations
    Chen, Letian
    Jayanthi, Sravan
    Paleja, Rohan
    Martin, Daniel
    Zakharov, Viacheslav
    Gombolay, Matthew
    CONFERENCE ON ROBOT LEARNING, VOL 205, 2022, 205 : 2083 - 2094
  • [7] Analysis of Inverse Reinforcement Learning with Perturbed Demonstrations
    Melo, Francisco S.
    Lopes, Manuel
    Ferreira, Ricardo
    ECAI 2010 - 19TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2010, 215 : 349 - 354
  • [8] Learning from Demonstrations and Human Evaluative Feedbacks: Handling Sparsity and Imperfection Using Inverse Reinforcement Learning Approach
    Mourad, Nafee
    Ezzeddine, Ali
    Nadjar Araabi, Babak
    Nili Ahmadabadi, Majid
    JOURNAL OF ROBOTICS, 2020, 2020
  • [9] Model-Based Inverse Reinforcement Learning from Visual Demonstrations
    Das, Neha
    Bechtle, Sarah
    Davchev, Todor
    Jayaraman, Dinesh
    Rai, Akshara
    Meier, Franziska
    CONFERENCE ON ROBOT LEARNING, VOL 155, 2020, 155 : 1930 - 1942
  • [10] Learning Virtual Grasp with Failed Demonstrations via Bayesian Inverse Reinforcement Learning
    Xie, Xu
    Li, Changyang
    Zhang, Chi
    Zhu, Yixin
    Zhu, Song-Chun
    2019 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2019, : 1812 - 1817