ATTEXPLAINER: Explain Transformer via Attention by Reinforcement Learning

Times Cited: 0
Authors
Niu, Runliang [1 ]
Wei, Zhepei [1 ]
Wang, Yan [1 ,2 ]
Wang, Qi [1 ]
Affiliations
[1] Jilin Univ, Sch Artificial Intelligence, Changchun, Peoples R China
[2] Jilin Univ, Minist Educ, Coll Comp Sci & Technol, Key Lab Symbol Computat & Knowledge Engn, Changchun, Peoples R China
Funding
National Natural Science Foundation of China
DOI: not available
CLC Number: TP18 [Artificial Intelligence Theory]
Discipline Codes: 081104; 0812; 0835; 1405
Abstract
Transformer and its variants, built on attention mechanisms, have recently achieved remarkable performance in many NLP tasks. Most existing work on Transformer explanation reveals and uses the attention matrix qualitatively, guided by human intuition. However, the high dimensionality of the attention matrix makes quantitative analysis difficult for these methods. Therefore, in this paper, we propose ATTEXPLAINER, a novel reinforcement learning (RL) based framework that explains Transformers through the attention matrix. The RL agent learns to perform step-by-step masking operations by observing the changes in the attention matrices. We adapt the method to two scenarios: perturbation-based model explanation and text adversarial attack. Experiments on three widely used text classification benchmarks validate the effectiveness of the proposed method compared with state-of-the-art baselines. Additional studies show that our method is highly transferable and consistent with human intuition. The code of this paper is available at https://github.com/niuzaisheng/AttExplainer.
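The step-by-step masking procedure described in the abstract can be illustrated with a minimal, self-contained sketch (not the authors' implementation): a policy masks one token per step while observing how the attention matrix changes, and stops when the classifier's prediction flips (the adversarial-attack scenario) or after a fixed budget of steps (the explanation scenario). The model stub `model_forward`, the greedy scoring rule used in place of the learned RL agent, and all names below are hypothetical placeholders; a real implementation would query a Transformer classifier with attention outputs enabled.

```python
import numpy as np

def model_forward(token_ids, mask):
    """Hypothetical stand-in for a Transformer classifier.
    Returns (class_probs, attention_matrix); a real implementation would
    run the model with attention outputs enabled."""
    rng = np.random.default_rng(int(mask.sum()))       # deterministic toy outputs
    n = len(token_ids)
    attn = rng.random((n, n)) * np.outer(mask, mask)    # masked tokens carry no attention
    attn /= attn.sum(axis=-1, keepdims=True) + 1e-9
    logits = rng.random(2)
    probs = np.exp(logits) / np.exp(logits).sum()
    return probs, attn

def masking_episode(token_ids, max_steps=5):
    """Greedy stand-in for the learned RL policy: each step masks the token
    whose removal changes the attention matrix the most."""
    mask = np.ones(len(token_ids))
    probs, attn = model_forward(token_ids, mask)
    base_label = probs.argmax()
    trajectory = []
    for _ in range(max_steps):
        candidates = np.flatnonzero(mask)
        # Observation: how much the attention matrix shifts if token i is masked.
        changes = []
        for i in candidates:
            trial = mask.copy()
            trial[i] = 0.0
            _, trial_attn = model_forward(token_ids, trial)
            changes.append(np.abs(trial_attn - attn).sum())
        best = candidates[int(np.argmax(changes))]
        mask[best] = 0.0
        probs, attn = model_forward(token_ids, mask)
        trajectory.append((int(best), float(max(changes)), int(probs.argmax())))
        if probs.argmax() != base_label:                 # prediction flipped: attack succeeded
            break
    return trajectory

if __name__ == "__main__":
    tokens = list(range(8))                              # toy "sentence" of 8 token ids
    for step in masking_episode(tokens):
        print("masked token %d | attention change %.3f | predicted label %d" % step)
```

In the full method an RL agent is trained to choose the mask at each step from the observed attention matrices; the greedy attention-change heuristic above merely stands in for that learned policy to keep the sketch short and dependency-free.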
Pages: 724-731
Page count: 8
Related Papers
50 records in total
  • [1] Chen, Lili; Lu, Kevin; Rajeswaran, Aravind; Lee, Kimin; Grover, Aditya; Laskin, Michael; Abbeel, Pieter; Srinivas, Aravind; Mordatch, Igor. Decision Transformer: Reinforcement Learning via Sequence Modeling. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021.
  • [2] Badrinath, Anirudhan; Flet-Berliac, Yannis; Nie, Allen; Brunskill, Emma. Waypoint Transformer: Reinforcement Learning via Supervised Learning with Intermediate Targets. Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023.
  • [3] Fei, Hao; Zhang, Yue; Ren, Yafeng; Ji, Donghong. Optimizing Attention for Sequence Modeling via Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(8): 3612-3621.
  • [4] Shen, Xiangqing; Liu, Bing; Zhou, Yong; Zhao, Jiaqi. Remote sensing image caption generation via transformer and reinforcement learning. Multimedia Tools and Applications, 2020, 79(35-36): 26661-26682.
  • [5] Ge, Lun; Zhou, Xiaoguang; Li, Yongqiang; Wang, Yongcong. Deep reinforcement learning navigation via decision transformer in autonomous driving. Frontiers in Neurorobotics, 2024, 18.
  • [6] Wang, Siyu; Chen, Xiaocong; Jannach, Dietmar; Yao, Lina. Causal Decision Transformer for Recommender Systems via Offline Reinforcement Learning. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023), 2023: 1599-1608.
  • [7] Yildiz, Beytullah. Reinforcement learning using fully connected, attention, and transformer models in knapsack problem solving. Concurrency and Computation: Practice and Experience, 2022, 34(9).
  • [8] Cao, Qingxing; Lin, Liang; Shi, Yukai; Liang, Xiaodan; Li, Guanbin. Attention-Aware Face Hallucination via Deep Reinforcement Learning. 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), 2017: 1656-1664.
  • [9] Li, Jingyuan; Dai, Yuan; Hu, Yihan; Li, Jiangnan; Yin, Wenbo; Tao, Jun; Wang, Lingli. TransMap: An Efficient CGRA Mapping Framework via Transformer and Deep Reinforcement Learning. 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW 2024), 2024: 626-633.