Inverse Reinforcement Learning from a Gradient-based Learner

Cited by: 0
Authors
Ramponi, Giorgia [1]
Drappo, Gianluca [1]
Restelli, Marcello [1]
Affiliation
[1] Politecn Milan, Milan, Italy
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Inverse Reinforcement Learning addresses the problem of inferring an expert's reward function from demonstrations. However, in many applications, we not only have access to the expert's near-optimal behaviour, but we also observe part of her learning process. In this paper, we propose a new algorithm for this setting, in which the goal is to recover the reward function being optimized by an agent, given a sequence of policies produced during learning. Our approach is based on the assumption that the observed agent is updating her policy parameters along the gradient direction. Then we extend our method to deal with the more realistic scenario where we only have access to a dataset of learning trajectories. For both settings, we provide theoretical insights into our algorithms' performance. Finally, we evaluate the approach in a simulated GridWorld environment and on the MuJoCo environments, comparing it with the state-of-the-art baseline.
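The gradient-direction assumption in the abstract can be illustrated with a small sketch. This is not the paper's algorithm, which this record does not describe in detail; it is a minimal illustration under added assumptions: the reward is linear in known features with weights w, the learner uses a constant positive learning rate, and the per-feature policy gradients (the hypothetical `jacobians` argument) are available exactly rather than estimated from trajectories. Under those assumptions each observed update theta_{t+1} - theta_t equals the learning rate times the Jacobian times w, so w can be recovered up to a positive scale by least squares.

```python
import numpy as np


def recover_reward_weights(jacobians, deltas):
    """Estimate linear reward weights, up to positive scale, from parameter updates.

    jacobians: array of shape (T, d, q); jacobians[t][:, i] is the policy
               gradient with respect to reward feature i at step t
               (assumed known here, which is an illustrative simplification).
    deltas:    array of shape (T, d); deltas[t] = theta_{t+1} - theta_t.
    """
    jacobians = np.asarray(jacobians, dtype=float)
    deltas = np.asarray(deltas, dtype=float)
    T, d, q = jacobians.shape
    # Stack all steps into one linear system A @ (alpha * w) = b.
    A = jacobians.reshape(T * d, q)
    b = deltas.reshape(T * d)
    scaled_w, *_ = np.linalg.lstsq(A, b, rcond=None)
    # The learning rate is unknown, so only the direction of w is identifiable.
    norm = np.linalg.norm(scaled_w)
    return scaled_w / norm if norm > 0 else scaled_w


# Synthetic check: simulate a learner that follows the gradient exactly.
rng = np.random.default_rng(0)
T, d, q = 5, 4, 3                      # steps, policy parameters, reward features
true_w = np.array([0.6, 0.3, 0.1])     # hypothetical ground-truth weights
alpha = 0.05                           # hypothetical constant learning rate
jac = rng.normal(size=(T, d, q))
observed_deltas = alpha * (jac @ true_w)

w_hat = recover_reward_weights(jac, observed_deltas)
cosine = float(w_hat @ true_w / np.linalg.norm(true_w))
print("cosine similarity with true weights:", cosine)
```

In the trajectory-only setting mentioned in the abstract, neither the policy parameters nor the Jacobians would be given directly, so both would have to be estimated from data before a fit of this kind could be attempted.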
Pages: 11
Related papers
50 in total
  • [21] Categorical Foundations of Gradient-Based Learning
    Cruttwell, Geoffrey S. H.
    Gavranovic, Bruno
    Ghani, Neil
    Wilson, Paul
    Zanasi, Fabio
    PROGRAMMING LANGUAGES AND SYSTEMS, ESOP 2022, 2022, 13240 : 1 - 28
  • [22] Gradient-Based Learning of Finite Automata
    del Pozo Romero, Juan Fdez
    Lago-Fernandez, Luis F.
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VIII, 2023, 14261 : 294 - 305
  • [23] On Gradient-Based Learning in Continuous Games
    Mazumdar, Eric
    Ratliff, Lillian J.
    Sastry, S. Shankar
    SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2020, 2 (01) : 103 - 131
  • [24] Topological Gradient-based Competitive Learning
    Barbiero, Pietro
    Ciravegna, Gabriele
    Randazzo, Vincenzo
    Pasero, Eros
    Cirrincione, Giansalvo
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [25] Object recognition with gradient-based learning
    LeCun, Y
    Haffner, P
    Bottou, L
    Bengio, Y
    SHAPE, CONTOUR AND GROUPING IN COMPUTER VISION, 1999, 1681 : 319 - 345
  • [26] Failures of Gradient-Based Deep Learning
    Shalev-Shwartz, Shai
    Shamir, Ohad
    Shammah, Shaked
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [27] Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints
    Tschiatschek, Sebastian
    Ghosh, Ahana
    Haug, Luis
    Devidze, Rati
    Singla, Adish
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [28] On the use of the policy gradient and Hessian in inverse reinforcement learning
    Metelli, Alberto Maria
    Pirotta, Matteo
    Restelli, Marcello
    INTELLIGENZA ARTIFICIALE, 2020, 14 (01) : 117 - 150
  • [29] Inverse Reinforcement Learning through Policy Gradient Minimization
    Pirotta, Matteo
    Restelli, Marcello
    THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 1993 - 1999
  • [30] GLR: Gradient-Based Learning Rate Scheduler
    Spatafora, Maria Ausilia Napoli
    Ortis, Alessandro
    Battiato, Sebastiano
    IMAGE ANALYSIS AND PROCESSING, ICIAP 2023, PT I, 2023, 14233 : 269 - 281