Unifying Model Explainability and Robustness for Joint Text Classification and Rationale Extraction

Cited by: 0
Authors:
Li, Dongfang [1]
Hu, Baotian [1]
Chen, Qingcai [1,2]
Xu, Tujie [1]
Tao, Jingcong [1]
Zhang, Yunan [1]
Affiliations:
[1] Harbin Institute of Technology (Shenzhen), Shenzhen, China
[2] Peng Cheng Laboratory, Shenzhen, China
Keywords: none listed
DOI: not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract:
Recent work has shown that explainability and robustness are two crucial ingredients of trustworthy and reliable text classification. However, previous works usually address only one of these two aspects: i) how to extract accurate rationales for explainability while remaining beneficial to prediction; or ii) how to make the predictive model robust to different types of adversarial attacks. Intuitively, a model that produces helpful explanations should also be more robust against adversarial attacks, because we cannot trust a model that outputs explanations yet changes its prediction under small perturbations. To this end, we propose a joint classification and rationale extraction model named AT-BMC. It includes two key mechanisms: mixed Adversarial Training (AT), which uses various perturbations in the discrete and embedding spaces to improve the model's robustness, and a Boundary Match Constraint (BMC), which helps locate rationales more precisely with the guidance of boundary information. Results on benchmark datasets demonstrate that the proposed AT-BMC outperforms baselines on both classification and rationale extraction by a large margin. Robustness analysis shows that AT-BMC effectively decreases the attack success rate by up to 69%. The empirical results indicate that there are connections between robust models and better explanations.
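To make the mixed adversarial training component more concrete, below is a minimal PyTorch sketch of the embedding-space half of the idea: an FGM-style perturbation of the word-embedding table along the normalized loss gradient, trained jointly with the clean loss. This is an illustration under assumed names (model, embedding_layer, input_ids, labels, epsilon), not the authors' released implementation, and it omits the discrete-space perturbations and the Boundary Match Constraint described above.

import torch
import torch.nn.functional as F

def fgm_adversarial_step(model, embedding_layer, input_ids, labels, epsilon=1.0):
    # Clean forward/backward pass; this populates gradients on the embedding table.
    logits = model(input_ids)
    loss = F.cross_entropy(logits, labels)
    loss.backward()

    grad = embedding_layer.weight.grad
    norm = torch.norm(grad)
    if norm != 0 and not torch.isnan(norm):
        # Perturb the embeddings along the normalized gradient direction.
        delta = epsilon * grad / norm
        embedding_layer.weight.data.add_(delta)

        # Adversarial pass; its gradients accumulate with the clean ones,
        # so a single optimizer step afterwards trains on both views.
        adv_loss = F.cross_entropy(model(input_ids), labels)
        adv_loss.backward()

        # Restore the original embedding weights before the optimizer step.
        embedding_layer.weight.data.sub_(delta)

    return loss.item()

In a full AT-BMC-style setup, a rationale-extraction head and a boundary-match term would presumably contribute to the same total loss; the sketch above only shows how an embedding-space perturbation can be folded into an ordinary training step.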
Pages: 10947-10955 (9 pages)
Related Papers (50 in total; 10 shown)
  • [1] Arazo, Eric; Stoev, Hristo; Bosch, Cristian; Suarez-Cetrulo, Andres L.; Simon-Carbajo, Ricardo. Xpression: A Unifying Metric to Optimize Compression and Explainability Robustness of AI Models. EXPLAINABLE ARTIFICIAL INTELLIGENCE, PT I, XAI 2024, 2024, 2153: 370-382.
  • [2] Chen, Guoming; Long, Shun; Yuan, Zeduo; Li, Wanyi; Peng, Junfeng. Robustness and Explainability of Image Classification Based on QCNN. Quantum Engineering, 2023, 2023.
  • [3] Qiang, Yao; Kumar, Supriya Tumkur Suresh; Brocanelli, Marco; Zhu, Dongxiao. Tiny RNN Model with Certified Robustness for Text Classification. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022.
  • [4] Atanasova, Pepa; Simonsen, Jakob Grue; Lioma, Christina; Augenstein, Isabelle. A Diagnostic Study of Explainability Techniques for Text Classification. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020: 3256-3274.
  • [5] Koreeda, Y.; Mase, H.; Yanai, K. A joint neural model for patent classification and rationale identification. Transactions of the Japanese Society for Artificial Intelligence, 2019, 34(05).
  • [6] Cao, Jiarun; Wang, Chongwen; Gao, Liming. A Joint Model for Text and Image Semantic Feature Extraction. 2018 INTERNATIONAL CONFERENCE ON ALGORITHMS, COMPUTING AND ARTIFICIAL INTELLIGENCE (ACAI 2018), 2018.
  • [7] Blomqvist, Christopher; Enflo, Kerstin; Jakobsson, Andreas; Astrom, Kalle. Joint Handwritten Text Recognition and Word Classification for Tabular Information Extraction. 2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022: 1564-1570.
  • [8] Liu, XingYu; Liu, Yu; Wu, HangYu; Guan, QingQuan. A tag based joint extraction model for Chinese medical text. COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2021, 93.
  • [9] Chen, Tianrui; Wu, Juanping; Guo, Weiwei; Zhang, Zenghui. Adversarial Robustness of Deep Learning Methods for SAR Image Classification: An Explainability View. IGARSS 2024-2024 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, IGARSS 2024, 2024: 1987-1991.
  • [10] Garg, Sahaj; Perot, Vincent; Limtiaco, Nicole; Taly, Ankur; Chi, Ed H.; Beutel, Alex. Counterfactual Fairness in Text Classification through Robustness. AIES '19: PROCEEDINGS OF THE 2019 AAAI/ACM CONFERENCE ON AI, ETHICS, AND SOCIETY, 2019: 219-226.