Regret Bounds for Risk-Sensitive Reinforcement Learning

被引:0
|
作者
Bastani, Osbert [1 ]
Ma, Yecheng Jason [1 ]
Shen, Estelle [1 ]
Xu, Wanqiao [2 ]
机构
[1] Univ Penn, Philadelphia, PA 19104 USA
[2] Stanford Univ, Stanford, CA USA
关键词
VALUE-AT-RISK;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In safety-critical applications of reinforcement learning such as healthcare and robotics, it is often desirable to optimize risk-sensitive objectives that account for tail outcomes rather than expected reward. We prove the first regret bounds for reinforcement learning under a general class of risk-sensitive objectives including the popular CVaR objective. Our theory is based on a novel characterization of the CVaR objective as well as a novel optimistic MDP construction.
引用
收藏
页数:11
相关论文
共 50 条
  • [1] Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds
    Liang, Hao
    Luo, Zhi-Quan
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25
  • [2] Regret Bounds for Risk-sensitive Reinforcement Learning with Lipschitz Dynamic Risk Measures
    Liang, Hao
    Luo, Zhi-Quan
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [3] Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning
    Fei, Yingjie
    Yang, Zhuoran
    Chen, Yudong
    Wang, Zhaoran
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [4] Cascaded Gaps: Towards Logarithmic Regret for Risk-Sensitive Reinforcement Learning
    Fei, Yingjie
    Xu, Ruitu
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [5] Learning Bounds for Risk-sensitive Learning
    Lee, Jaeho
    Park, Sejun
    Shin, Jinwoo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [6] Risk-Sensitive Reinforcement Learning
    Shen, Yun
    Tobia, Michael J.
    Sommer, Tobias
    Obermayer, Klaus
    NEURAL COMPUTATION, 2014, 26 (07) : 1298 - 1328
  • [7] Risk-sensitive reinforcement learning
    Mihatsch, O
    Neuneier, R
    MACHINE LEARNING, 2002, 49 (2-3) : 267 - 290
  • [8] Risk-Sensitive Reinforcement Learning
    Oliver Mihatsch
    Ralph Neuneier
    Machine Learning, 2002, 49 : 267 - 290
  • [9] A Tighter Problem-Dependent Regret Bound for Risk-Sensitive Reinforcement Learning
    Hu, Xiaoyan
    Leung, Ho-Fung
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 206, 2023, 206
  • [10] On tight bounds for function approximation error in risk-sensitive reinforcement learning
    Karmakar, Prasenjit
    Bhatnagar, Shalabh
    SYSTEMS & CONTROL LETTERS, 2021, 150