Mask-Attention A3C: Visual Explanation of Action-State Value in Deep Reinforcement Learning

Cited: 0
Authors
Itaya, Hidenori [1 ]
Hirakawa, Tsubasa [2 ]
Yamashita, Takayoshi [2 ]
Fujiyoshi, Hironobu [3 ]
Sugiura, Komei [4 ]
Affiliations
[1] Chubu Univ, Dept Comp Sci, Kasugai, Aichi 4878501, Japan
[2] Chubu Univ, Ctr Math Sci & Artificial Intelligence, Kasugai, Aichi 4878501, Japan
[3] Chubu Univ, Dept Robot, Kasugai, Aichi 4878501, Japan
[4] Keio Univ, Dept Comp Sci, Yokohama, Kanagawa 2238522, Japan
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Deep reinforcement learning; explainable AI; visual explanation; video games; robot manipulation
DOI
10.1109/ACCESS.2024.3416179
Chinese Library Classification
TP [Automation technology; computer technology]
Discipline Code
0812
Abstract
Deep reinforcement learning (DRL) can learn an agent's optimal behavior from the experience it gains through interacting with its environment. However, because the decision-making process of a DRL agent is a black box, it is difficult for users to understand the reasons for the agent's actions. To date, conventional visual explanation methods for DRL agents have focused only on the policy, not on the state value. In this work, we propose a DRL method called Mask-Attention A3C (Mask A3C) that analyzes an agent's decision-making by focusing on both the policy and value branches, which produce different outputs. Inspired by the Actor-Critic framework, our method introduces an attention mechanism that applies mask processing to the feature maps of the policy and value branches using mask-attention, a heat-map representation of the basis for judging the policy and the state value. We also propose a Mask-attention Loss to obtain highly interpretable mask-attention; with this loss, the agent learns not to gaze at regions that do not affect its decision-making. Our evaluations with Atari 2600 as a video-game strategy task and robot manipulation as a robot-control task showed that visualizing an agent's mask-attention during action selection facilitates the analysis of its decision-making. We also investigated the effect of the Mask-attention Loss and confirmed that it is useful for analyzing agents' decision-making. In addition, a user survey on predicting the agent's behavior showed that these mask-attentions are highly interpretable to users.
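The core mechanism described in the abstract (gating a branch's feature map element-wise with a sigmoid heat-map) can be sketched as follows. This is a minimal illustration based only on the abstract, not the paper's actual implementation; the function name `mask_attention`, the tensor shapes, and the use of plain numpy are all assumptions for the sake of a self-contained example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mask_attention(feature_map, attn_logits):
    """Gate a convolutional feature map with a sigmoid heat-map.

    feature_map: (C, H, W) features of the policy or value branch.
    attn_logits: (1, H, W) raw attention logits for that branch.
    Returns the masked features and the heat-map itself, which can be
    overlaid on the input frame as the visual explanation.
    """
    mask = sigmoid(attn_logits)        # values in (0, 1): the heat-map
    return feature_map * mask, mask    # mask broadcasts over channels

# Toy usage with random tensors standing in for network activations.
rng = np.random.default_rng(0)
features = rng.standard_normal((32, 7, 7))
logits = rng.standard_normal((1, 7, 7))
masked, mask = mask_attention(features, logits)
```

In this sketch each branch would carry its own mask, so the policy and value heat-maps can differ; a Mask-attention Loss as described in the abstract could then be realized, for example, as a penalty on the mask's magnitude so the agent learns to suppress regions that do not influence its decision.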
Pages: 86553-86571
Page count: 19
Related Papers
7 records
  • [1] Better Deep Visual Attention with Reinforcement Learning in Action Recognition
    Wang, Gang
    Wang, Wenmin
    Wang, Jingzhuo
    Bu, Yaohua
    2017 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2017,
  • [2] A3C Deep Reinforcement Learning Model Compression and Knowledge Extraction
    Zhang J.
    Wang Z.
    Ren Y.
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2023, 60 (06): : 1373 - 1384
  • [3] Air Combat Maneuver Decision Method Based on A3C Deep Reinforcement Learning
    Fan, Zihao
    Xu, Yang
    Kang, Yuhang
    Luo, Delin
    MACHINES, 2022, 10 (11)
  • [4] Visual Explanation With Action Query Transformer in Deep Reinforcement Learning and Visual Feedback via Augmented Reality
    Itaya, Hidenori
    Yin, Wantao
    Hirakawa, Tsubasa
    Yamashita, Takayoshi
    Fujiyoshi, Hironobu
    Sugiura, Komei
    IEEE ACCESS, 2025, 13 : 56338 - 56354
  • [5] Visual Explanation using Attention Mechanism in Actor-Critic-based Deep Reinforcement Learning
    Itaya, Hidenori
    Hirakawa, Tsubasa
    Yamashita, Takayoshi
    Fujiyoshi, Hironobu
    Sugiura, Komei
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [6] Deep Reinforcement Learning with Importance Weighted A3C for QoE enhancement in Video Delivery Services
    Naresh, Mandan
    Saxena, Paresh
    Gupta, Manik
    2023 IEEE 24TH INTERNATIONAL SYMPOSIUM ON A WORLD OF WIRELESS, MOBILE AND MULTIMEDIA NETWORKS, WOWMOM, 2023, : 97 - 106
  • [7] Resource Pricing and Allocation in MEC Enabled Blockchain Systems: An A3C Deep Reinforcement Learning Approach
    Du, Jianbo
    Cheng, Wenjie
    Lu, Guangyue
    Cao, Haotong
    Chu, Xiaoli
    Zhang, Zhicai
    Wang, Junxuan
    IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2022, 9 (01): : 33 - 44