Regret Bounds for Risk-Sensitive Reinforcement Learning

Cited by: 0
Authors
Bastani, Osbert [1]
Ma, Yecheng Jason [1]
Shen, Estelle [1]
Xu, Wanqiao [2]
Affiliations
[1] Univ Penn, Philadelphia, PA 19104 USA
[2] Stanford Univ, Stanford, CA USA
Keywords
VALUE-AT-RISK
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In safety-critical applications of reinforcement learning such as healthcare and robotics, it is often desirable to optimize risk-sensitive objectives that account for tail outcomes rather than expected reward. We prove the first regret bounds for reinforcement learning under a general class of risk-sensitive objectives including the popular CVaR objective. Our theory is based on a novel characterization of the CVaR objective as well as a novel optimistic MDP construction.
Pages: 11
Related papers
50 in total
  • [31] Risk-sensitive reinforcement learning applied to control under constraints
    Geibel, P
    Wysotzki, F
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2005, 24 : 81 - 108
  • [32] Risk-sensitive reinforcement learning applied to control under constraints
    Geibel, P. (PGEIBEL@UOS.DE), 1600, American Association for Artificial Intelligence (24):
  • [33] Risk-Sensitive Portfolio Management by using Distributional Reinforcement Learning
    Harnpadungkij, Thammasorn
    Chaisangmongkon, Warasinee
    Phunchongharn, Phond
    2019 IEEE 10TH INTERNATIONAL CONFERENCE ON AWARENESS SCIENCE AND TECHNOLOGY (ICAST 2019), 2019, : 110 - 115
  • [34] Risk-Sensitive Reinforcement Learning for URLLC Traffic in Wireless Networks
    Ben Khalifa, Nesrine
    Assaad, Mohamad
    Debbah, Merouane
    2019 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC), 2019,
  • [35] Bounds for a risk-sensitive homing problem
    Makasu, Cloud
    AUTOMATICA, 2024, 163
  • [36] Regret Bounds for Learning State Representations in Reinforcement Learning
    Ortner, Ronald
    Pirotta, Matteo
    Fruit, Ronan
    Lazaric, Alessandro
    Maillard, Odalric-Ambrym
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [37] Variational Bayesian Reinforcement Learning with Regret Bounds
    O'Donoghue, Brendan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
  • [38] Non-stationary Risk-Sensitive Reinforcement Learning: Near-Optimal Dynamic Regret, Adaptive Detection, and Separation Design
    Ding, Yuhao
    Jin, Ming
    Lavaei, Javad
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 6, 2023, : 7405 - 7413
  • [39] Risk-Sensitive Reinforcement Learning Part I: Constrained Optimization Framework
    Prashanth, L. A.
    2019 FIFTH INDIAN CONTROL CONFERENCE (ICC), 2019, : 9 - 9
  • [40] Risk-Sensitive Reinforcement Learning Via Entropic-VaR Optimization
    Ni, Xinyi
    Lai, Lifeng
    2022 56TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS, AND COMPUTERS, 2022, : 953 - 959