Unifying Model Explainability and Robustness for Joint Text Classification and Rationale Extraction

Citations: 0

Authors:

Li, Dongfang [1]

Hu, Baotian [1]

Chen, Qingcai [1,2]

Xu, Tujie [1]

Tao, Jingcong [1]

Zhang, Yunan [1]

Affiliations:

[1] Harbin Institute of Technology, Shenzhen, Shenzhen, People's Republic of China

[2] Peng Cheng Laboratory, Shenzhen, People's Republic of China

Source:

THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2022

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recent works have shown that explainability and robustness are two crucial ingredients of trustworthy and reliable text classification. However, previous works usually address only one of two aspects: i) how to extract accurate rationales for explainability while being beneficial to prediction; ii) how to make the predictive model robust to different types of adversarial attacks. Intuitively, a model that produces helpful explanations should be more robust against adversarial attacks, because we cannot trust a model that outputs explanations but changes its prediction under small perturbations. To this end, we propose a joint classification and rationale extraction model named AT-BMC. It includes two key mechanisms: mixed Adversarial Training (AT), which uses various perturbations in discrete and embedding space to improve the model's robustness, and the Boundary Match Constraint (BMC), which helps to locate rationales more precisely with the guidance of boundary information. Results on benchmark datasets demonstrate that the proposed AT-BMC outperforms baselines on both classification and rationale extraction by a large margin. Robustness analysis shows that the proposed AT-BMC decreases the attack success rate by up to 69%. The empirical results indicate that there are connections between robust models and better explanations.

Pages: 10947-10955

Page count: 9
