Anomalous State Sequence Modeling to Enhance Safety in Reinforcement Learning

Cited: 0
Authors
Kweider, Leen [1 ]
Abou Kassem, Maissa [1 ]
Sandouk, Ubai [2 ]
Affiliations
[1] Damascus Univ, Fac Informat Technol, Dept Artificial Intelligence, Damascus, Syria
[2] Damascus Univ, Fac Informat Technol, Dept Software Engn, Damascus, Syria
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Safety; Anomaly detection; Reinforcement learning; Artificial intelligence; Optimization; Uncertainty; Measurement uncertainty; Costs; Decision making; Training; AI safety; reinforcement learning; anomaly detection; sequence modeling; risk-averse policy; reward shaping
DOI
10.1109/ACCESS.2024.3486549
CLC Number
TP [Automation technology, computer technology]
Discipline Code
0812
Abstract
The deployment of artificial intelligence (AI) in decision-making applications requires ensuring an appropriate level of safety and reliability, particularly in changing environments that contain a large number of unknown observations. To address this challenge, we propose a novel safe reinforcement learning (RL) approach that utilizes anomalous state sequences to enhance RL safety. Our proposed solution, Safe Reinforcement Learning with Anomalous State Sequences (AnoSeqs), consists of two stages. First, we train an agent in a non-safety-critical offline 'source' environment to collect safe state sequences. Next, we use these safe sequences to build an anomaly detection model that can detect potentially unsafe state sequences in a 'target' safety-critical environment where failures can have high costs. The estimated risk from the anomaly detection model is used to train a risk-averse RL policy in the target environment; this involves adjusting the reward function to penalize the agent for visiting anomalous states deemed unsafe by our anomaly model. In experiments on multiple safety-critical benchmarking environments, including self-driving cars, our approach successfully learns safer policies and demonstrates that sequential anomaly detection can provide an effective supervisory signal for training safety-aware RL agents.
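As a rough illustration of the two-stage idea in the abstract, the sketch below fits a deliberately simple window-statistics anomaly scorer on 'safe' state sequences (stage one) and uses its score as a reward penalty (stage two). All names (`SequenceAnomalyScorer`, `shaped_reward`), the z-score detector, and the threshold are illustrative assumptions, not the authors' actual anomaly model or benchmarks:

```python
import numpy as np

class SequenceAnomalyScorer:
    """Toy stand-in for a sequence anomaly model: fits the per-feature
    mean/std of fixed-length state windows collected in the safe 'source'
    environment and scores new windows by mean absolute z-score."""

    def __init__(self, window: int = 4):
        self.window = window
        self.mu = None
        self.sigma = None

    def _windows(self, states: np.ndarray) -> np.ndarray:
        # Overlapping windows of `window` consecutive states, flattened.
        n = len(states) - self.window + 1
        return np.stack([states[i:i + self.window].ravel() for i in range(n)])

    def fit(self, safe_states: np.ndarray) -> None:
        w = self._windows(safe_states)
        self.mu = w.mean(axis=0)
        self.sigma = w.std(axis=0) + 1e-8  # avoid division by zero

    def score(self, recent_states: np.ndarray) -> float:
        # Higher score = the recent window looks less like the safe data.
        w = recent_states[-self.window:].ravel()
        return float(np.abs((w - self.mu) / self.sigma).mean())

def shaped_reward(env_reward: float, anomaly_score: float,
                  penalty_weight: float = 1.0, threshold: float = 3.0) -> float:
    # Reward shaping: subtract a penalty only when the window is anomalous.
    return env_reward - penalty_weight * max(0.0, anomaly_score - threshold)

# Stage 1: collect "safe" states in a non-critical source environment
# (simulated here as draws from a standard normal distribution).
rng = np.random.default_rng(0)
scorer = SequenceAnomalyScorer(window=4)
scorer.fit(rng.normal(size=(200, 3)))

# Stage 2: score state windows from the target environment.
normal_score = scorer.score(rng.normal(size=(4, 3)))   # in-distribution
unsafe_score = scorer.score(np.full((4, 3), 10.0))     # far from safe data
```

In a full RL loop, `shaped_reward` would replace the raw environment reward at every step, steering the policy away from state windows the scorer flags as anomalous.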
Pages: 157140-157148 (9 pages)
Related Papers (50 records)
  • [1] Decision Transformer: Reinforcement Learning via Sequence Modeling
    Chen, Lili
    Lu, Kevin
    Rajeswaran, Aravind
    Lee, Kimin
    Grover, Aditya
    Laskin, Michael
    Abbeel, Pieter
    Srinivas, Aravind
    Mordatch, Igor
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [2] Addressing Optimism Bias in Sequence Modeling for Reinforcement Learning
    Villaflor, Adam
    Huang, Zhe
    Pande, Swapnil
    Dolan, John
    Schneider, Jeff
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [3] Optimizing Attention for Sequence Modeling via Reinforcement Learning
    Fei, Hao
    Zhang, Yue
    Ren, Yafeng
    Ji, Donghong
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (08) : 3612 - 3621
  • [4] Offline Reinforcement Learning as One Big Sequence Modeling Problem
    Janner, Michael
    Li, Qiyang
    Levine, Sergey
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [5] Multi-Agent Reinforcement Learning is A Sequence Modeling Problem
    Wen, Muning
    Kuba, Jakub Grudzien
    Lin, Runji
    Zhang, Weinan
    Wen, Ying
    Wang, Jun
    Yang, Yaodong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [6] VISITOR: Visual Interactive State Sequence Exploration for Reinforcement Learning
    Metz, Yannick
    Bykovets, Eugene
    Joos, Lucas
    Keim, Daniel
    El-Assady, Mennatallah
    COMPUTER GRAPHICS FORUM, 2023, 42 (03) : 397 - 408
  • [7] Reinforcement learning inclusion to alter design sequence of finite element modeling
    Ciklamini, Marek
    Cejnek, Matous
    MULTISCALE AND MULTIDISCIPLINARY MODELING EXPERIMENTS AND DESIGN, 2024, 7 (05) : 4721 - 4734
  • [8] Reinforcement learning under temporal logic constraints as a sequence modeling problem
    Tian, Daiying
    Fang, Hao
    Yang, Qingkai
    Yu, Haoyong
    Liang, Wenyu
    Wu, Yan
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2023, 161
  • [9] Inertia-Constrained Reinforcement Learning to Enhance Human Motor Control Modeling
    Korivand, Soroush
    Jalili, Nader
    Gong, Jiaqi
    SENSORS, 2023, 23 (05)
  • [10] State representation modeling for deep reinforcement learning based recommendation
    Liu, Feng
    Tang, Ruiming
    Li, Xutao
    Zhang, Weinan
    Ye, Yunming
    Chen, Haokun
    Guo, Huifeng
    Zhang, Yuzhou
    He, Xiuqiang
    KNOWLEDGE-BASED SYSTEMS, 2020, 205