Dynamic Inverse Reinforcement Learning for Characterizing Animal Behavior

Times Cited: 0
Authors
Ashwood, Zoe C. [1,2]
Jha, Aditi [1,3]
Pillow, Jonathan W. [1]
Affiliations
[1] Princeton Univ, Princeton Neurosci Inst, Princeton, NJ 08544 USA
[2] Princeton Univ, Dept Comp Sci, Princeton, NJ 08544 USA
[3] Princeton Univ, Dept Elect & Comp Engn, Princeton, NJ USA
Keywords
MODELS
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Understanding decision-making is a core objective in both neuroscience and psychology, and computational models have often been helpful in the pursuit of this goal. While many models have been developed for characterizing behavior in binary decision-making and bandit tasks, comparatively little work has focused on animal decision-making in more complex tasks, such as navigation through a maze. Inverse reinforcement learning (IRL) is a promising approach for understanding such behavior, as it aims to infer the unknown reward function of an agent from its observed trajectories through state space. However, IRL has yet to be widely applied in neuroscience. One potential reason for this is that existing IRL frameworks assume that an agent's reward function is fixed over time. To address this shortcoming, we introduce dynamic inverse reinforcement learning (DIRL), a novel IRL framework that allows for time-varying intrinsic rewards. Our method parametrizes the unknown reward function as a time-varying linear combination of spatial reward maps (which we refer to as "goal maps"). We develop an efficient inference method for recovering this dynamic reward function from behavioral data. We demonstrate DIRL in simulated experiments and then apply it to a dataset of mice exploring a labyrinth. Our method returns interpretable reward functions for two separate cohorts of mice, and provides a novel characterization of exploratory behavior. We expect DIRL to have broad applicability in neuroscience, and to facilitate the design of biologically inspired reward functions for training artificial agents.
Pages: 14
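
The abstract describes the reward function as a time-varying linear combination of fixed spatial "goal maps". Below is a minimal sketch of that parametrization only; all names and shapes (n_states, goal_maps, weights) are illustrative assumptions, not the paper's actual code, and the inference procedure itself is not shown.

```python
import numpy as np

# Illustrative sketch of the DIRL reward parametrization: the reward at
# time t over states s is r_t(s) = sum_k w_k(t) * g_k(s), where the
# goal maps g_k are fixed over time and the weights w_k(t) vary.
# All names and values here are hypothetical stand-ins.

n_states = 64   # e.g., nodes of a discretized maze
K = 3           # number of goal maps
T = 100         # number of time steps

# Each goal map assigns a scalar reward to every state: g_k(s).
goal_maps = np.random.rand(K, n_states)

# Time-varying weights w_k(t); DIRL infers these from observed
# trajectories. Here a smooth random walk stands in for them.
weights = np.cumsum(0.1 * np.random.randn(T, K), axis=0)

# Dynamic reward: row t holds the reward map in effect at time t.
reward = weights @ goal_maps      # shape (T, n_states)
print(reward.shape)               # (100, 64)
```

Because the goal maps are shared across time and only the K weights change, the dynamic reward stays low-dimensional and interpretable: each weight trace shows how strongly the animal values that spatial goal at each moment.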