Inverse Reinforcement Learning from a Gradient-based Learner

Authors
Ramponi, Giorgia [1]
Drappo, Gianluca [1]
Restelli, Marcello [1]
Affiliations
[1] Politecn Milan, Milan, Italy
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Inverse Reinforcement Learning addresses the problem of inferring an expert's reward function from demonstrations. However, in many applications, we not only have access to the expert's near-optimal behaviour, but we also observe part of her learning process. In this paper, we propose a new algorithm for this setting, in which the goal is to recover the reward function being optimized by an agent, given a sequence of policies produced during learning. Our approach is based on the assumption that the observed agent is updating her policy parameters along the gradient direction. Then we extend our method to deal with the more realistic scenario where we only have access to a dataset of learning trajectories. For both settings, we provide theoretical insights into our algorithms' performance. Finally, we evaluate the approach in a simulated GridWorld environment and on the MuJoCo environments, comparing it with the state-of-the-art baseline.
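The gradient assumption in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm, only one simple reading of the idea under assumptions that are not in the record: the reward is linear in some features with weights w, the learner takes plain gradient steps θ_{t+1} = θ_t + α Ψ_t w, and the Jacobians Ψ_t (how each policy parameter changes the expected feature counts) are known exactly. In practice they would have to be estimated from trajectories. All names (`recover_reward_weights`, `jacobians`, `true_w`) are hypothetical. Under these assumptions the observed parameter differences are linear in w, so the weights can be recovered up to a positive scale by least squares:

```python
import numpy as np

def recover_reward_weights(thetas, jacobians):
    """Least-squares estimate of linear reward weights, up to positive scale.

    Assumes (hypothetically) theta[t+1] - theta[t] = alpha * jacobians[t] @ w.
    """
    deltas = np.diff(np.asarray(thetas), axis=0)   # (T, d): observed updates
    A = np.concatenate(jacobians, axis=0)          # (T*d, q): stacked Jacobians
    b = deltas.reshape(-1)                         # (T*d,): stacked updates
    scaled_w, *_ = np.linalg.lstsq(A, b, rcond=None)  # estimates alpha * w
    return scaled_w / np.linalg.norm(scaled_w)     # remove the unknown step size

# Synthetic learner: 4 policy parameters, 3 reward features, 5 gradient steps.
rng = np.random.default_rng(0)
true_w = np.array([0.6, -0.8, 0.0])  # unit-norm weights, unknown to the observer
alpha = 0.1                          # learning rate, also unknown to the observer
theta = np.zeros(4)
thetas, jacobians = [theta], []
for _ in range(5):
    psi = rng.normal(size=(4, 3))    # stand-in for the true Jacobian at theta
    jacobians.append(psi)
    theta = theta + alpha * psi @ true_w
    thetas.append(theta)

w_hat = recover_reward_weights(thetas, jacobians)
print(w_hat)
```

Because the step size is unknown, only the direction of w is identifiable here, which is why the estimate is normalized before being returned.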
Pages: 11