Gradient-Based Inverse Risk-Sensitive Reinforcement Learning

Cited by: 0
Authors
Mazumdar, Eric [1]
Ratliff, Lillian J. [2]
Fiez, Tanner [2]
Sastry, S. Shankar [1]
Affiliations
[1] Univ Calif Berkeley, Elect Engn & Comp Sci Dept, Berkeley, CA 94720 USA
[2] Univ Washington, Elect Engn Dept, Seattle, WA 98195 USA
Keywords
PROSPECT-THEORY; CHOICE; DECISIONS
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We address the problem of inverse reinforcement learning in Markov decision processes where the agent is risk-sensitive. In particular, we model risk-sensitivity in a reinforcement learning framework by making use of models of human decision-making having their origins in behavioral psychology and economics. We propose a gradient-based inverse reinforcement learning algorithm that minimizes a loss function defined on the observed behavior. We demonstrate the performance of the proposed technique on two examples, the first of which is the canonical Grid World example and the second of which is an MDP modeling passengers' decisions regarding ride-sharing. In the latter, we use pricing and travel time data from a ride-sharing company to construct the transition probabilities and rewards of the MDP.
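The idea in the abstract, fitting a risk-sensitivity parameter by gradient descent on a loss over observed behavior, can be illustrated with a minimal sketch. Everything below is an assumption for illustration and not the authors' algorithm: the 3-state MDP and its numbers are hypothetical, the prospect-theory-style utility is applied directly to rewards, the policy is a Boltzmann policy from soft value iteration, the loss is a negative log-likelihood, and the gradient is taken by central finite differences with backtracking rather than analytically.

```python
import numpy as np

# Hypothetical 3-state, 2-action MDP; all numbers are illustrative only.
P = np.array([  # P[s, a, s'] transition probabilities
    [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1]],
    [[0.5, 0.5, 0.0], [0.0, 0.2, 0.8]],
    [[0.0, 0.3, 0.7], [0.6, 0.0, 0.4]],
])
R = np.array([[1.0, -1.0], [-2.0, 0.5], [2.0, -0.5]])  # R[s, a]
GAMMA, EXPONENT = 0.9, 0.88  # discount and curvature (assumed values)

def utility(r, lam):
    """Prospect-theory-style value function; lam scales losses."""
    mag = np.abs(r) ** EXPONENT
    return np.where(r >= 0, mag, -lam * mag)

def log_policy(lam, iters=100):
    """Log of a Boltzmann policy from soft value iteration on utility(R)."""
    u = utility(R, lam)
    q = np.zeros_like(R)
    for _ in range(iters):
        m = q.max(axis=1)
        v = m + np.log(np.exp(q - m[:, None]).sum(axis=1))  # stable log-sum-exp
        q = u + GAMMA * (P @ v)
    logits = q - q.max(axis=1, keepdims=True)
    return logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

def loss(lam, demo_weights):
    """Negative log-likelihood of observed state-action frequencies."""
    return -np.sum(demo_weights * log_policy(lam))

def grad(lam, demo_weights, h=1e-5):
    """Central finite-difference gradient (stand-in for an analytic one)."""
    return (loss(lam + h, demo_weights) - loss(lam - h, demo_weights)) / (2 * h)

def fit(demo_weights, lam0=1.0, lr=0.5, steps=50):
    """Gradient descent on lam; a step is accepted only if the loss does not rise."""
    lam, cur = lam0, loss(lam0, demo_weights)
    for _ in range(steps):
        g = grad(lam, demo_weights)
        step = lr
        while step > 1e-8:
            new = loss(lam - step * g, demo_weights)
            if new <= cur:
                break
            step /= 2
        else:
            break  # no acceptable step found; stop early
        lam, cur = lam - step * g, new
    return lam

# Synthetic "observed behavior": exact action frequencies of an agent with
# lam = 2.0, assuming every state is visited equally often.
true_lam = 2.0
demo = np.exp(log_policy(true_lam)) / R.shape[0]
lam_hat = fit(demo)
print("estimated loss-aversion parameter:", lam_hat)
```

The demonstrations here are exact action frequencies rather than sampled trajectories, which keeps the sketch deterministic. With real data, `demo` would instead hold empirical state-action counts normalized to sum to one.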
Pages: 6
Related papers
50 in total
  • [41] Robust Reinforcement Learning for Risk-Sensitive Linear Quadratic Gaussian Control
    Cui, Leilei
    Basar, Tamer
    Jiang, Zhong-Ping
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2024, 69 (11) : 7678 - 7693
  • [42] A Gradient-Based Reinforcement Learning Algorithm for Multiple Cooperative Agents
    Zhang, Zhen
    Wang, Dongqing
    Zhao, Dongbin
    Han, Qiaoni
    Song, Tingting
    IEEE ACCESS, 2018, 6 : 70223 - 70235
  • [43] Traffic Light Control with Policy Gradient-Based Reinforcement Learning
    Tas, Mehmet Bilge Han
    Ozkan, Kemal
    Saricicek, Inci
    Yazici, Ahmet
    32ND IEEE SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU 2024, 2024
  • [44] Reinforcement-Learning-Based Risk-Sensitive Optimal Feedback Mechanisms of Biological Motor Control
    Cui, Leilei
    Pang, Bo
    Jiang, Zhong-Ping
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023, : 7944 - 7949
  • [45] Learning Bounds for Risk-sensitive Learning
    Lee, Jaeho
    Park, Sejun
    Shin, Jinwoo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [46] Risk-sensitive online learning
    Even-Dar, Eyal
    Kearns, Michael
    Wortman, Jennifer
    ALGORITHMIC LEARNING THEORY, PROCEEDINGS, 2006, 4264 : 199 - 213
  • [47] Exponential TD Learning: A Risk-Sensitive Actor-Critic Reinforcement Learning Algorithm
    Noorani, Erfaun
    Mavridis, Christos N.
    Baras, John S.
    2023 AMERICAN CONTROL CONFERENCE, ACC, 2023, : 4104 - 4109
  • [48] One Risk to Rule Them All: A Risk-Sensitive Perspective on Model-Based Offline Reinforcement Learning
    Rigter, Marc
    Lacerda, Bruno
    Hawes, Nick
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [49] Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning
    Fei, Yingjie
    Yang, Zhuoran
    Chen, Yudong
    Wang, Zhaoran
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [50] Sample-Efficient Multimodal Dynamics Modeling for Risk-Sensitive Reinforcement Learning
    Yashima, Ryota
    Yamaguchi, Akihiko
    Hashimoto, Koichi
    2022 8TH INTERNATIONAL CONFERENCE ON MECHATRONICS AND ROBOTICS ENGINEERING (ICMRE 2022), 2022, : 21 - 27