ATTEXPLAINER: Explain Transformer via Attention by Reinforcement Learning

Cited by: 0
Authors
Niu, Runliang [1 ]
Wei, Zhepei [1 ]
Wang, Yan [1 ,2 ]
Wang, Qi [1 ]
Affiliations
[1] Jilin Univ, Sch Artificial Intelligence, Changchun, Peoples R China
[2] Jilin Univ, Minist Educ, Coll Comp Sci & Technol, Key Lab Symbol Computat & Knowledge Engn, Changchun, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Transformer and its variants, built on attention mechanisms, have recently achieved remarkable performance on many NLP tasks. Most existing work on Transformer explanation reveals and utilizes the attention matrix qualitatively, guided by subjective human intuition. However, the high dimensionality of the attention matrix makes it difficult for these methods to analyze it quantitatively. Therefore, in this paper we propose ATTEXPLAINER, a novel reinforcement learning (RL) based framework for explaining Transformers via the attention matrix. The RL agent learns to perform step-by-step masking operations by observing changes in the attention matrix. We adapt our method to two scenarios: perturbation-based model explanation and text adversarial attack. Experiments on three widely used text classification benchmarks validate the effectiveness of the proposed method compared to state-of-the-art baselines. Additional studies show that our method is highly transferable and consistent with human intuition. The code of this paper is available at https://github.com/niuzaisheng/AttExplainer.
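The step-by-step masking loop described in the abstract can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: `fake_model`, `greedy_masking_episode`, and the greedy attention-sum policy are all stand-ins (the paper uses a learned RL policy observing real Transformer attention matrices), and the reward here is simply the drop in a dummy confidence score after each mask.

```python
import numpy as np

rng = np.random.default_rng(0)

def fake_model(tokens, masked):
    """Stand-in for a Transformer: returns (attention matrix, confidence)."""
    n = len(tokens)
    attn = rng.random((n, n))
    attn[:, list(masked)] = 0.0           # masked tokens receive no attention
    attn /= attn.sum(axis=1, keepdims=True)
    score = 1.0 - len(masked) / n         # dummy confidence for illustration
    return attn, score

def greedy_masking_episode(tokens, steps=3):
    """Greedy stand-in for the learned policy: each step masks the
    unmasked token that currently draws the most total attention."""
    masked = set()
    _, prev_score = fake_model(tokens, masked)
    trace = []
    for _ in range(steps):
        attn, _ = fake_model(tokens, masked)
        candidates = [i for i in range(len(tokens)) if i not in masked]
        pick = max(candidates, key=lambda i: attn[:, i].sum())
        masked.add(pick)
        _, score = fake_model(tokens, masked)
        trace.append((tokens[pick], prev_score - score))  # confidence drop
        prev_score = score
    return trace

print(greedy_masking_episode(["the", "movie", "was", "great", "!"]))
```

The per-step confidence drop ranks tokens by importance, which is the perturbation-based explanation scenario; pushing the score low enough with few masks corresponds to the adversarial-attack scenario.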
Pages: 724-731
Page count: 8