Robustness-Aware Word Embedding Improves Certified Robustness to Adversarial Word Substitutions

Cited by: 0
Authors
Wang, Yibin [1]
Yang, Yichen [1]
He, Di [2]
He, Kun [1]
Affiliations
[1] Huazhong University of Science and Technology, School of Computer Science and Technology, Wuhan, People's Republic of China
[2] Peking University, School of Intelligence Science and Technology, Beijing, People's Republic of China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Natural Language Processing (NLP) models have achieved great success on clean texts, but they are known to be vulnerable to adversarial examples, which are typically crafted by synonym substitutions. In this paper, we aim to solve this problem and find that word embedding is important to the certified robustness of NLP models. Based on this finding, we propose the Embedding Interval Bound Constraint (EIBC) triplet loss to train robustness-aware word embeddings for better certified robustness. We optimize the EIBC triplet loss to reduce distances between synonyms in the embedding space, which is theoretically proven to tighten the verification boundary. Meanwhile, we enlarge distances among non-synonyms, maintaining the semantic representation of word embeddings. Our method is conceptually simple and modular. It can be easily combined with Interval Bound Propagation (IBP) training and improves the certified robust accuracy from 76.73% to 84.78% on the IMDB dataset. Experiments demonstrate that our method outperforms various state-of-the-art certified defense baselines and generalizes well to unseen substitutions. The code is available at https://github.com/JHL-HUST/EIBC-IBP/.
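The abstract does not give the formula for the EIBC triplet loss, so the following is only a minimal sketch of the general idea it describes: pull synonyms toward a word in embedding space, push non-synonyms away up to a margin, and observe that the elementwise interval covering a word and its synonyms shrinks as the synonyms move closer. The function names, the L1 distance, the margin value, and the toy vectors are all assumptions made for illustration and are not taken from the paper or its repository.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def l1_distance(u: Vector, v: Vector) -> float:
    """L1 distance between two embeddings (choice of norm is an assumption)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def triplet_style_loss(
    anchor: Vector,
    synonyms: Sequence[Vector],
    non_synonyms: Sequence[Vector],
    margin: float = 1.0,
) -> float:
    """Generic triplet-style loss; NOT the paper's exact EIBC formula.

    The pull term is the mean distance from the anchor to its synonyms.
    The push term is the mean distance to non-synonyms, capped at `margin`
    so that non-synonyms already far away stop contributing.
    """
    pull = sum(l1_distance(anchor, s) for s in synonyms) / len(synonyms)
    push = sum(
        min(l1_distance(anchor, n), margin) for n in non_synonyms
    ) / len(non_synonyms)
    return pull - push + margin


def interval_bounds(
    anchor: Vector, synonyms: Sequence[Vector]
) -> Tuple[List[float], List[float]]:
    """Elementwise lower and upper bounds covering a word and its synonyms.

    Interval-bound methods propagate a box like this through the model,
    so a narrower box at the embedding layer gives tighter bounds later.
    """
    points = [anchor, *synonyms]
    lower = [min(coords) for coords in zip(*points)]
    upper = [max(coords) for coords in zip(*points)]
    return lower, upper


# Hypothetical 2-dimensional embeddings, for illustration only.
anchor = [0.0, 0.0]
loose_synonyms = [[1.0, 0.0], [0.0, 1.0]]
tight_synonyms = [[0.5, 0.0], [0.0, 0.5]]
non_synonyms = [[3.0, 0.0], [0.5, 0.0]]

loss_loose = triplet_style_loss(anchor, loose_synonyms, non_synonyms)
loss_tight = triplet_style_loss(anchor, tight_synonyms, non_synonyms)
box_loose = interval_bounds(anchor, loose_synonyms)
box_tight = interval_bounds(anchor, tight_synonyms)
```

In this toy setting, moving the synonyms closer to the anchor lowers the loss and narrows the box in every dimension, which is the intuition behind the abstract's claim that reducing synonym distances tightens the verification boundary.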
Pages: 673-687
Page count: 15
Related papers
50 in total
  • [41] Improving Classification Robustness for Noisy Texts with Robust Word Vectors
    Malykh V.
    Lyalin V.
    Journal of Mathematical Sciences, 2023, 273 (4) : 605 - 613
  • [42] Acquisition and change: On the robustness of the triggering experience for word order cues
    Westergaard, Marit
    LINGUA, 2008, 118 (12) : 1841 - 1863
  • [43] Word-Level Textual Adversarial Attack in the Embedding Space
    Zhu, Bin
    Gu, Zhaoquan
    Xie, Yushun
    Wu, Danni
    Qian, Yaguan
    Wang, Le
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021
  • [44] Label-Adversarial Jointly Trained Acoustic Word Embedding
    Li, Zhaoqi
    Li, Ta
    Zhao, Qingwei
    Zhang, Pengyuan
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2022, E105D (08) : 1501 - 1505
  • [45] Frequency-Guided Word Substitutions for Detecting Textual Adversarial Examples
    Mozes, Maximilian
    Stenetorp, Pontus
    Kleinberg, Bennett
    Griffin, Lewis D.
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021 : 171 - 186
  • [46] Improving the robustness of LSTMs for word classification using stressed word endings in dual-state word-beam search
    Ameryan, Mahya
    Schomaker, Lambert
    2020 17TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR 2020), 2020 : 13 - 18
  • [47] Structure-Aware Stabilization of Adversarial Robustness with Massive Contrastive Adversaries
    Yang, Shuo
    Feng, Zeyu
    Du, Pei
    Du, Bo
    Xu, Chang
    2021 21ST IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2021), 2021 : 807 - 816
  • [48] A Robustness-Aware Real-Time SFC Routing Update Scheme in Multi-Tenant Clouds
    Tu, Huaqing
    Zhao, Gongming
    Xu, Hongli
    Zhao, Yangming
    Zhai, Yutong
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2022, 30 (03) : 1230 - 1244
  • [49] Adversarial Framework with Certified Robustness for Time-Series Domain via Statistical Features
    Belkhouja, Taha
    Doppa, Janardhan Rao
    Journal of Artificial Intelligence Research, 2022, 73 : 1435 - 1471
  • [50] Adversarial Framework with Certified Robustness for Time-Series Domain via Statistical Features
    Belkhouja, Taha
    Doppa, Janardhan Rao
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2022, 73 : 1435 - 1471