Skill Reward for Safe Deep Reinforcement Learning

Authors
Cheng, Jiangchang [1]
Yu, Fumin [1]
Zhang, Hongliang [1]
Dai, Yinglong [2, 3]
Affiliations
[1] Hunan Normal Univ, Coll Informat Sci & Engn, Changsha 410081, Peoples R China
[2] Natl Univ Def Technol, Coll Liberal Arts & Sci, Changsha 410073, Peoples R China
[3] Hunan Prov Key Lab Intelligent Comp & Language In, Changsha 410081, Peoples R China
Source
UBIQUITOUS SECURITY | 2022 / Vol. 1557
Keywords
Reinforcement learning; Deep reinforcement learning; Reward shaping; Skill reward; Safe agent; LEVEL;
DOI
10.1007/978-981-19-0468-4_15
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Reinforcement learning enables an agent to interact with an environment and learn from experience to maximize the cumulative reward of a specific task, yielding a powerful agent for decision optimization problems. This process closely resembles human learning, which also proceeds through interaction with the environment. However, the behavior of an agent trained with deep reinforcement learning is often unpredictable, and the agent sometimes produces strange and unsafe behavior. To make the agent's behavior and decision process explainable and controllable, this paper proposes the skill reward method, which constrains the agent to learn controllable and safe behaviors. When the agent completes specific skills while interacting with the environment, it receives rewards designed from prior knowledge, which makes the learning process converge quickly. The skill reward can be embedded into existing reinforcement learning algorithms. In this work, we embed the skill reward into the asynchronous advantage actor-critic (A3C) algorithm and test the method in an Atari 2600 environment (Breakout-v4). The experiments demonstrate the effectiveness of the skill reward embedding method.
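The idea of adding a prior-knowledge reward when the agent completes a skill can be illustrated with a small reward-shaping wrapper. This is only a minimal sketch under stated assumptions, not the paper's implementation: the abstract does not specify how skills are detected or how large the skill reward is, so `SkillRewardWrapper`, `skill_detector`, `skill_bonus`, and the toy environment `ToyLineEnv` are all hypothetical names chosen for illustration, and a toy environment stands in for Breakout-v4 and A3C.

```python
class ToyLineEnv:
    """Hypothetical toy environment: walk from position 0 to `goal` on a line."""

    def __init__(self, goal=3):
        self.goal = goal
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # action: 1 = move right, anything else = move left
        self.pos += 1 if action == 1 else -1
        done = self.pos == self.goal
        env_reward = 1.0 if done else 0.0  # sparse task reward
        return self.pos, env_reward, done, {}


class SkillRewardWrapper:
    """Adds a fixed bonus to the task reward whenever a skill is detected.

    `skill_detector(prev_obs, action, obs)` encodes prior knowledge about
    which transitions count as a completed skill (an assumption of this sketch).
    """

    def __init__(self, env, skill_detector, skill_bonus=0.5):
        self.env = env
        self.skill_detector = skill_detector
        self.skill_bonus = skill_bonus
        self.prev_obs = None

    def reset(self):
        self.prev_obs = self.env.reset()
        return self.prev_obs

    def step(self, action):
        obs, env_reward, done, info = self.env.step(action)
        skill_done = self.skill_detector(self.prev_obs, action, obs)
        shaped = env_reward + (self.skill_bonus if skill_done else 0.0)
        self.prev_obs = obs
        # Keep the unshaped reward in `info` so it can still be logged.
        info = {**info, "env_reward": env_reward, "skill_done": skill_done}
        return obs, shaped, done, info


def moved_toward_goal(prev_obs, action, obs):
    """Hypothetical skill: the agent moved one step closer to the goal."""
    return obs > prev_obs


env = SkillRewardWrapper(ToyLineEnv(goal=3), moved_toward_goal, skill_bonus=0.5)
env.reset()
rewards = [env.step(a)[1] for a in [1, 0, 1, 1, 1]]
```

Because the wrapper only changes the reward the learner sees, any learning algorithm that consumes `(observation, reward, done)` tuples could be trained on top of it without modification, which is one way to read the abstract's statement that the skill reward can be embedded into existing algorithms.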
Pages: 203 - 213
Page count: 11