Learning From Demonstrations: A Computationally Efficient Inverse Reinforcement Learning Approach With Simplified Implementation

Times Cited: 0
Authors
Lin, Yanbin [1 ]
Ni, Zhen [1 ]
Zhong, Xiangnan [1 ]
Affiliations
[1] Florida Atlantic Univ, Dept Elect Engn & Comp Sci, Boca Raton, FL 33431 USA
Funding
U.S. National Science Foundation
Keywords
Trajectory; Reinforcement learning; Training; Optimization; Heuristic algorithms; Approximation algorithms; Markov decision processes; Iterative methods; Imitation learning; Computational modeling; Actor critic methods; inverse reinforcement learning; neural networks; reward recovering; featurization network; online optimization
DOI
10.1109/TETCI.2025.3526502
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Reinforcement learning (RL) typically requires a reward function crafted from sophisticated domain knowledge to perform well. Inverse reinforcement learning (IRL) methods offer a way to recover such reward functions from existing expert demonstrations. However, current IRL methods demand intensive computation and excessive memory as the state space grows. To this end, we propose a computationally efficient inverse reinforcement learning (e-IRL) approach that 1) simplifies the gradient algorithm for the reward network; 2) implements the loss function with feature expectations instead of state visitation frequencies; and 3) enables fast, accurate, and automatic feature processing. Specifically, we design a new featurization network that accommodates trajectories and automatically outputs aligned feature vectors without human intervention. In addition, the proposed approach derives simplified formulas for the gradient of the loss function used to update the reward-network weights, eliminating the repeated computation and excessive storage of gradient parameters over demonstrated trajectories required by existing methods. Feature expectations are computed once for the expert (one-shot) and in a streamlined fashion for the learner, efficiently yielding the required gradient parameters. Three examples validate the effectiveness of the proposed method, which outperforms comparable methods in average return, average number of steps required, and number of demonstrations required.
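The abstract's core computational idea, matching discounted feature expectations between expert and learner rather than tracking state visitation frequencies, can be illustrated with a short sketch. This is a minimal illustration assuming a linear reward r(s) = w·φ(s) and a generic feature-matching gradient; the function names, the toy featurizer, and the learning rate are assumptions for exposition, not the paper's actual e-IRL formulas or its featurization network.

    import numpy as np

    def feature_expectation(trajectories, featurize, gamma=0.99):
        # Empirical discounted feature expectation:
        # mu = (1/N) * sum_i sum_t gamma^t * phi(s_{i,t})
        mu = None
        for traj in trajectories:
            for t, state in enumerate(traj):
                phi = (gamma ** t) * featurize(state)
                mu = phi if mu is None else mu + phi
        return mu / len(trajectories)

    def reward_gradient_step(w, mu_expert, mu_learner, lr=0.1):
        # For a linear reward r(s) = w @ phi(s), a feature-matching loss
        # L(w) = w @ (mu_learner - mu_expert) has gradient (mu_learner - mu_expert);
        # stepping against it reshapes the reward to favor expert-like features.
        return w - lr * (mu_learner - mu_expert)

    # Toy usage with a hypothetical 3-dimensional featurizer.
    featurize = lambda s: np.asarray(s, dtype=float)
    expert_trajs  = [[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]]
    learner_trajs = [[(0.0, 0.0, 1.0), (0.0, 1.0, 0.0)]]
    mu_E = feature_expectation(expert_trajs, featurize)   # expert: computed once ("one-shot")
    mu_L = feature_expectation(learner_trajs, featurize)  # learner: refreshed each iteration
    w = reward_gradient_step(np.zeros(3), mu_E, mu_L)

Because the demonstrations are fixed, mu_E needs to be computed only once and cached, which corresponds to the "one-shot" expert computation the abstract mentions; only the learner side must be recomputed from fresh rollouts as the policy improves.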
Pages: 13
Related Papers (50 records in total)
  • [1] An Efficient Unified Approach Using Demonstrations for Inverse Reinforcement Learning
    Hwang, Maxwell
    Jiang, Wei-Cheng
    Chen, Yu-Jen
    Hwang, Kao-Shing
    Tseng, Yi-Chia
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2021, 13 (03) : 444 - 452
  • [2] An Unified Approach to Inverse Reinforcement Learning by Oppositive Demonstrations
    Hwang, Kao-Shing
    Jiang, Wei-Cheng
    Tseng, Yi-Chia
    PROCEEDINGS 2016 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY (ICIT), 2016, : 1664 - 1668
  • [3] Learning Fairness from Demonstrations via Inverse Reinforcement Learning
    Blandin, Jack
    Kash, Ian
    PROCEEDINGS OF THE 2024 ACM CONFERENCE ON FAIRNESS, ACCOUNTABILITY, AND TRANSPARENCY, ACM FACCT 2024, 2024, : 51 - 61
  • [4] Inverse Reinforcement Learning of Interaction Dynamics from Demonstrations
    Hussein, Mostafa
    Begum, Momotaz
    Petrik, Marek
    2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, : 2267 - 2274
  • [5] Sample Efficient Reinforcement Learning through Learning from Demonstrations in Minecraft
    Scheller, Christian
    Schraner, Yanick
    Vogel, Manfred
    NEURIPS 2019 COMPETITION AND DEMONSTRATION TRACK, VOL 123, 2019, 123 : 67 - 76
  • [6] Fast Lifelong Adaptive Inverse Reinforcement Learning from Demonstrations
    Chen, Letian
    Jayanthi, Sravan
    Paleja, Rohan
    Martin, Daniel
    Zakharov, Viacheslav
    Gombolay, Matthew
    CONFERENCE ON ROBOT LEARNING, VOL 205, 2022, 205 : 2083 - 2094
  • [7] Analysis of Inverse Reinforcement Learning with Perturbed Demonstrations
    Melo, Francisco S.
    Lopes, Manuel
    Ferreira, Ricardo
    ECAI 2010 - 19TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2010, 215 : 349 - 354
  • [8] Learning from Demonstrations and Human Evaluative Feedbacks: Handling Sparsity and Imperfection Using Inverse Reinforcement Learning Approach
    Mourad, Nafee
    Ezzeddine, Ali
    Nadjar Araabi, Babak
    Nili Ahmadabadi, Majid
    JOURNAL OF ROBOTICS, 2020, 2020
  • [9] Model-Based Inverse Reinforcement Learning from Visual Demonstrations
    Das, Neha
    Bechtle, Sarah
    Davchev, Todor
    Jayaraman, Dinesh
    Rai, Akshara
    Meier, Franziska
    CONFERENCE ON ROBOT LEARNING, VOL 155, 2020, 155 : 1930 - 1942
  • [10] Learning Virtual Grasp with Failed Demonstrations via Bayesian Inverse Reinforcement Learning
    Xie, Xu
    Li, Changyang
    Zhang, Chi
    Zhu, Yixin
    Zhu, Song-Chun
    2019 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2019, : 1812 - 1817