Defense against adversarial attacks via textual embeddings based on semantic associative field

Cited: 0
Authors
Huang, Jiacheng [1 ]
Chen, Long [1 ,2 ]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Sch Comp Sci & Technol, Chongqing 400065, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Sch Cyber Secur & Informat Law, Chongqing 400065, Peoples R China
Source
NEURAL COMPUTING & APPLICATIONS | 2024 / Vol. 36 / Issue 01
Keywords
Adversarial examples; Natural language processing; Semantic associative field; Word-level;
DOI
10.1007/s00521-023-08946-7
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Deep neural networks are known to be vulnerable to various types of adversarial attacks in the field of natural language processing, especially word-level attacks. In recent years, various defense methods have been proposed against word-level attacks; however, most of them focus only on synonym-substitution-based attacks, whereas not all word-level attacks rely on synonym substitution. In this paper, we propose a textual adversarial defense method against word-level adversarial attacks via textual embedding based on the semantic associative field. More specifically, we analyze why humans can read and understand textual adversarial examples and observe two crucial points: (1) there must be a relation between the original word and the perturbed word or token; (2) this relation enables humans to infer the original word, because humans have the ability to make associations. Motivated by this, we introduce the concept of the semantic associative field and propose a new defense method that builds a robust word embedding: we compute each word vector by applying the influence of related word vectors to it through a potential function and weighted embedding sampling, simulating the semantic influence between words in the same semantic field. We conduct comprehensive experiments and demonstrate that models using the proposed method achieve higher accuracy than baseline defense methods under various adversarial attacks as well as on the original test sets. Moreover, the proposed method is more universal, as it is independent of model structure and does not affect training efficiency.
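The robust-embedding construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's exact formulation: the exponential form of `potential`, the cosine-distance argument, the Dirichlet-based weighted sampling, and the 0.5 mixing coefficient are all assumptions made for the sketch.

```python
import numpy as np

def potential(distance, alpha=1.0):
    # Hypothetical potential function: a related word's influence decays
    # exponentially with its semantic distance, as in a physical field.
    return np.exp(-alpha * distance)

def robust_embedding(word_vec, field_vecs, alpha=1.0, rng=None):
    """Mix a word's vector with the vectors of words in its semantic
    associative field. Each field vector is weighted by a potential
    function of its cosine distance to the word, and the mixture is
    sampled so training sees slightly varied, field-consistent vectors."""
    rng = np.random.default_rng(rng)
    weights = []
    for v in field_vecs:
        cos = np.dot(word_vec, v) / (np.linalg.norm(word_vec) * np.linalg.norm(v))
        weights.append(potential(1.0 - cos, alpha))
    weights = np.asarray(weights)
    probs = weights / weights.sum()
    # Weighted embedding sampling: draw mixture coefficients centered on
    # the potential-derived weights (assumed Dirichlet here).
    coeffs = rng.dirichlet(probs * 10.0)
    field_part = sum(c * v for c, v in zip(coeffs, field_vecs))
    return 0.5 * word_vec + 0.5 * field_part
```

Because only the embedding layer is modified, a defense of this shape is architecture-agnostic, consistent with the abstract's claim that the method is independent of model structure.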
Pages: 289-301
Page count: 13
Related Papers
50 records in total
  • [41] Watermarking-based Defense against Adversarial Attacks on Deep Neural Networks
    Li, Xiaoting; Chen, Lingwei; Zhang, Jinquan; Larus, James; Wu, Dinghao
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021
  • [42] Instance-based defense against adversarial attacks in Deep Reinforcement Learning
    Garcia, Javier; Sagredo, Ismael
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2022, 107
  • [43] Transformer Based Defense GAN Against Palm-Vein Adversarial Attacks
    Li, Yantao; Ruan, Song; Qin, Huafeng; Deng, Shaojiang; El-Yacoubi, Mounim A.
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2023, 18 : 1509 - 1523
  • [44] Adversarial Defense via Learning to Generate Diverse Attacks
    Jang, Yunseok; Zhao, Tianchen; Hong, Seunghoon; Lee, Honglak
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 2740 - 2749
  • [45] Defending Adversarial Attacks via Semantic Feature Manipulation
    Wang, Shuo; Nepal, Surya; Rudolph, Carsten; Grobler, Marthie; Chen, Shangyu; Chen, Tianle; An, Zike
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2022, 15 (06) : 3184 - 3197
  • [46] ONION: A Simple and Effective Defense Against Textual Backdoor Attacks
    Qi, Fanchao; Chen, Yangyi; Li, Mukai; Yao, Yuan; Liu, Zhiyuan; Sun, Maosong
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 9558 - 9566
  • [47] Robust Textual Embedding against Word-level Adversarial Attacks
    Yang, Yichen; Wang, Xiaosen; He, Kun
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, VOL 180, 2022, 180 : 2214 - 2224
  • [48] Adversarial Attacks on Knowledge Graph Embeddings via Instance Attribution Methods
    Bhardwaj, Peru; Kelleher, John; Costabello, Luca; O'Sullivan, Dec
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 8225 - 8239
  • [49] ENSEMBLE ADVERSARIAL TRAINING BASED DEFENSE AGAINST ADVERSARIAL ATTACKS FOR MACHINE LEARNING-BASED INTRUSION DETECTION SYSTEM
    Haroon, M. S.; Ali, H. M.
    NEURAL NETWORK WORLD, 2023, 33 (05) : 317 - 336
  • [50] Towards a Practical Defense Against Adversarial Attacks on Deep Learning-Based Malware Detectors via Randomized Smoothing
    Gibert, Daniel; Zizzo, Giulio; Le, Quan
    COMPUTER SECURITY. ESORICS 2023 INTERNATIONAL WORKSHOPS, CPS4CIP, PT II, 2024, 14399 : 683 - 699