Probabilistic Counterexample Guidance for Safer Reinforcement Learning

Cited by: 1
Authors
Ji, Xiaotong [1 ]
Filieri, Antonio [1 ]
Affiliations
[1] Imperial College London, Department of Computing, London SW7 2AZ, England
Keywords
Safe reinforcement learning; Probabilistic model checking; Counterexample guidance
DOI
10.1007/978-3-031-43835-6_22
Chinese Library Classification (CLC)
TP301 [Theory and Methods]
Discipline code
081202
Abstract
Safe exploration aims to address the limitations of Reinforcement Learning (RL) in safety-critical scenarios, where failures during trial-and-error learning may incur high costs. Several methods exist to incorporate external knowledge or to use proximal sensor data to limit the exploration of unsafe states. However, reducing exploration risks in unknown environments, where an agent must discover safety threats during exploration, remains challenging. In this paper, we target the problem of safe exploration by guiding the training with counterexamples of the safety requirement. Our method abstracts both continuous and discrete state-space systems into compact abstract models representing the safety-relevant knowledge acquired by the agent during exploration. We then exploit probabilistic counterexample generation to construct minimal simulation submodels eliciting safety requirement violations, where the agent can efficiently train offline to refine its policy towards minimising the risk of safety violations during the subsequent online exploration. In preliminary experiments, our method reduces safety violations during online exploration by an average of 40.3% compared with standard QL and DQN algorithms and by 29.1% compared with previous related work, while achieving cumulative rewards comparable to unrestricted exploration and alternative approaches.
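The abstract describes a four-step loop: abstract the explored state space into a compact model, generate a probabilistic counterexample (a minimal submodel eliciting safety violations), train offline on that submodel to refine the policy, and resume online exploration. The Python sketch below illustrates this loop under simplifying assumptions; it is not the authors' implementation. All identifiers (AbstractModel, extract_counterexample, offline_refinement) are hypothetical, the counterexample step is approximated by sampling violating paths rather than by the probabilistic model checking used in the paper, and the offline phase uses plain tabular Q-learning with a fixed penalty on unsafe states.

```python
import random
from collections import defaultdict


class AbstractModel:
    """Compact abstraction of the explored state space: empirical
    transition counts over (abstract state, action) pairs."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def record(self, s, a, s_next):
        self.counts[(s, a)][s_next] += 1

    def actions(self, s):
        return [a for (s2, a) in self.counts if s2 == s]

    def probs(self, s, a):
        total = sum(self.counts[(s, a)].values())
        return {t: c / total for t, c in self.counts[(s, a)].items()}


def extract_counterexample(model, unsafe, horizon=20, rollouts=500):
    """Crude stand-in for probabilistic counterexample generation:
    sample paths through the abstract model and keep the states of
    every prefix that reaches an unsafe state. The union of these
    prefixes plays the role of the violation submodel."""
    violating = set()
    starts = [s for (s, _) in model.counts]
    if not starts:
        return violating
    for _ in range(rollouts):
        s = random.choice(starts)
        path = [s]
        for _ in range(horizon):
            acts = model.actions(s)
            if not acts:
                break
            dist = model.probs(s, random.choice(acts))
            s = random.choices(list(dist), weights=list(dist.values()))[0]
            path.append(s)
            if s in unsafe:
                violating.update(path)  # keep the whole violating prefix
                break
    return violating


def offline_refinement(model, submodel, unsafe, alpha=0.5, gamma=0.95,
                       episodes=200, max_steps=50):
    """Tabular Q-learning restricted to the counterexample submodel,
    with a penalty on unsafe states, so risky behaviour is corrected
    offline before online exploration resumes."""
    q = defaultdict(float)
    safe_starts = [s for s in submodel if s not in unsafe]
    if not safe_starts:
        return q
    for _ in range(episodes):
        s = random.choice(safe_starts)
        for _ in range(max_steps):
            acts = model.actions(s)
            if not acts:
                break
            a = random.choice(acts)
            dist = model.probs(s, a)
            s_next = random.choices(list(dist), weights=list(dist.values()))[0]
            reward = -1.0 if s_next in unsafe else 0.0
            best = max((q[(s_next, b)] for b in model.actions(s_next)),
                       default=0.0)
            q[(s, a)] += alpha * (reward + gamma * best - q[(s, a)])
            # stay inside the counterexample submodel
            if s_next in unsafe or s_next not in submodel:
                break
            s = s_next
    return q  # Q-values biased away from the observed violations
```

The design point the sketch preserves is that the penalty for reaching unsafe states is incurred offline, inside the small counterexample submodel, so the subsequent online phase starts from Q-values already biased away from the violations observed so far.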
Pages: 311-328
Page count: 18
Related Papers
Records [21]-[30] of 50
  • [21] Probabilistic inference for determining options in reinforcement learning
    Daniel, Christian
    van Hoof, Herke
    Peters, Jan
    Neumann, Gerhard
    MACHINE LEARNING, 2016, 104 (2-3) : 337 - 357
  • [22] Probabilistic Policy Reuse for Safe Reinforcement Learning
    Garcia, Javier
    Fernandez, Fernando
    ACM TRANSACTIONS ON AUTONOMOUS AND ADAPTIVE SYSTEMS, 2019, 13 (03)
  • [23] Verified Probabilistic Policies for Deep Reinforcement Learning
    Bacci, Edoardo
    Parker, David
    NASA FORMAL METHODS (NFM 2022), 2022, 13260 : 193 - 212
  • [24] Probabilistic Inference in Reinforcement Learning Done Right
    Tarbouriech, Jean
    Lattimore, Tor
    O'Donoghue, Brendan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [25] Probabilistic reinforcement learning in schizophrenia: Relationship to amotivation
    Dowd, Erin Connor
    Barch, Deanna Marie
    SCHIZOPHRENIA BULLETIN, 2011, 37 : 135 - 135
  • [26] Testing probabilistic equivalence through Reinforcement Learning
    Desharnais, Josee
    Laviolette, Francois
    Zhioua, Sami
    INFORMATION AND COMPUTATION, 2013, 227 : 21 - 57
  • [28] Reinforcement learning in a probabilistic learning task without time constraints
    Jablonska, Judyta
    Szumiec, Lukasz
    Parkitna, Jan Rodriguez
    PHARMACOLOGICAL REPORTS, 2019, 71 (06) : 1310 - 1310
  • [29] Counterexample Generation in Probabilistic Model Checking
    Han, Tingting
    Katoen, Joost-Pieter
    Damman, Berteun
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2009, 35 (02) : 241 - 257
  • [30] Automatic Ultrasound Guidance Based on Deep Reinforcement Learning
    Jarosik, Piotr
    Lewandowski, Marcin
    2019 IEEE INTERNATIONAL ULTRASONICS SYMPOSIUM (IUS), 2019, : 475 - 478