Mask-Attention A3C: Visual Explanation of Action-State Value in Deep Reinforcement Learning

Cited: 0
Authors
Itaya, Hidenori [1 ]
Hirakawa, Tsubasa [2 ]
Yamashita, Takayoshi [2 ]
Fujiyoshi, Hironobu [3 ]
Sugiura, Komei [4 ]
Affiliations
[1] Chubu Univ, Dept Comp Sci, Kasugai, Aichi 4878501, Japan
[2] Chubu Univ, Ctr Math Sci & Artificial Intelligence, Kasugai, Aichi 4878501, Japan
[3] Chubu Univ, Dept Robot, Kasugai, Aichi 4878501, Japan
[4] Keio Univ, Dept Comp Sci, Yokohama, Kanagawa 2238522, Japan
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Deep reinforcement learning; explainable AI; visual explanation; video games; robot manipulation
DOI
10.1109/ACCESS.2024.3416179
Chinese Library Classification
TP [Automation technology; computer technology]
Discipline Code
0812
Abstract
Deep reinforcement learning (DRL) can learn an agent's optimal behavior from the experience it gains through interacting with its environment. However, because the decision-making process of a DRL agent is a black box, it is difficult for users to understand the reasons for the agent's actions. To date, conventional visual explanation methods for DRL agents have focused only on the policy, not on the state value. In this work, we propose a DRL method called Mask-Attention A3C (Mask A3C) that analyzes an agent's decision-making by focusing on both the policy and value branches, which produce different outputs. Inspired by the Actor-Critic framework, our method introduces an attention mechanism that applies mask processing to the feature maps of the policy and value branches using mask-attention, a heat-map representation of the basis for judging the policy and the state value. We also propose a Mask-attention Loss to obtain highly interpretable mask-attention; with this loss, the agent learns not to gaze at regions that do not affect its decision-making. Our evaluations with Atari 2600 as a video-game strategy task and robot manipulation as a robot-control task showed that visualizing an agent's mask-attention during action selection facilitates the analysis of its decision-making. We also investigated the effect of the Mask-attention Loss and confirmed that it is useful for analyzing agents' decision-making. In addition, a user survey on predicting the agent's behavior showed that these mask-attentions are highly interpretable to users.
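The core mechanism described in the abstract (gating a branch's feature map element-wise with a sigmoid heat-map) can be sketched as follows. This is a minimal illustration based only on the abstract, not the paper's actual implementation; the function name `mask_attention`, the tensor shapes, and the use of plain numpy are all assumptions for the sake of a self-contained example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mask_attention(feature_map, attn_logits):
    """Gate a convolutional feature map with a sigmoid heat-map.

    feature_map: (C, H, W) features of the policy or value branch.
    attn_logits: (1, H, W) raw attention logits for that branch.
    Returns the masked features and the heat-map itself, which can be
    overlaid on the input frame as the visual explanation.
    """
    mask = sigmoid(attn_logits)        # values in (0, 1): the heat-map
    return feature_map * mask, mask    # mask broadcasts over channels

# Toy usage with random tensors standing in for network activations.
rng = np.random.default_rng(0)
features = rng.standard_normal((32, 7, 7))
logits = rng.standard_normal((1, 7, 7))
masked, mask = mask_attention(features, logits)
```

In this sketch each branch would carry its own mask, so the policy and value heat-maps can differ; a Mask-attention Loss as described in the abstract could then be realized, for example, as a penalty on the mask's magnitude so the agent learns to suppress regions that do not influence its decision.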
Pages: 86553-86571
Page count: 19
Related Papers
7 records
  • [1] Better Deep Visual Attention with Reinforcement Learning in Action Recognition
    Wang, Gang
    Wang, Wenmin
    Wang, Jingzhuo
    Bu, Yaohua
    2017 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2017,
  • [2] A3C Deep Reinforcement Learning Model Compression and Knowledge Extraction
    Zhang J.
    Wang Z.
    Ren Y.
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2023, 60 (06): : 1373 - 1384
  • [3] Air Combat Maneuver Decision Method Based on A3C Deep Reinforcement Learning
    Fan, Zihao
    Xu, Yang
    Kang, Yuhang
    Luo, Delin
    MACHINES, 2022, 10 (11)
  • [4] Visual Explanation With Action Query Transformer in Deep Reinforcement Learning and Visual Feedback via Augmented Reality
    Itaya, Hidenori
    Yin, Wantao
    Hirakawa, Tsubasa
    Yamashita, Takayoshi
    Fujiyoshi, Hironobu
    Sugiura, Komei
    IEEE ACCESS, 2025, 13 : 56338 - 56354
  • [5] Visual Explanation using Attention Mechanism in Actor-Critic-based Deep Reinforcement Learning
    Itaya, Hidenori
    Hirakawa, Tsubasa
    Yamashita, Takayoshi
    Fujiyoshi, Hironobu
    Sugiura, Komei
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [6] Deep Reinforcement Learning with Importance Weighted A3C for QoE enhancement in Video Delivery Services
    Naresh, Mandan
    Saxena, Paresh
    Gupta, Manik
    2023 IEEE 24TH INTERNATIONAL SYMPOSIUM ON A WORLD OF WIRELESS, MOBILE AND MULTIMEDIA NETWORKS, WOWMOM, 2023, : 97 - 106
  • [7] Resource Pricing and Allocation in MEC Enabled Blockchain Systems: An A3C Deep Reinforcement Learning Approach
    Du, Jianbo
    Cheng, Wenjie
    Lu, Guangyue
    Cao, Haotong
    Chu, Xiaoli
    Zhang, Zhicai
    Wang, Junxuan
    IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2022, 9 (01): : 33 - 44