CBAs: Character-level Backdoor Attacks against Chinese Pre-trained Language Models

Cited: 0
|
Authors
He, Xinyu [1 ]
Hao, Fengrui [2 ]
Gu, Tianlong [2 ]
Chang, Liang [1 ]
Affiliations
[1] Guilin Univ Elect Technol, Guilin, Guangxi, Peoples R China
[2] Jinan Univ, Engn Res Ctr Trustworthy AI, Guangzhou, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Pre-trained language models; backdoor attacks; Chinese; character;
DOI
10.1145/3678007
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Pre-trained language models (PLMs) provide natural and efficient language-interaction and text-processing capabilities across many domains. However, recent studies have shown that PLMs are highly vulnerable to malicious backdoor attacks, in which triggers injected into a model cause it to exhibit attacker-specified behavior. Unfortunately, existing research on backdoor attacks has focused mainly on English PLMs and paid little attention to Chinese PLMs; moreover, these existing attacks do not work well against Chinese PLMs. In this article, we disclose the limitations of English backdoor attacks against Chinese PLMs and propose character-level backdoor attacks (CBAs) against Chinese PLMs. Specifically, we first design three Chinese trigger-generation strategies that ensure the backdoor is reliably triggered while improving the effectiveness of the attack. Then, depending on the attacker's ability to access the training dataset, we develop trigger-injection mechanisms based on either target-label similarity or a masked language model, which select the most influential position at which to insert the trigger, maximizing the stealth of the attack. Extensive experiments on three major natural language processing tasks across various Chinese and English PLMs demonstrate the effectiveness and stealthiness of our method. In addition, CBAs show very strong resistance against three state-of-the-art backdoor defense methods.
Pages: 26
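As a rough illustration of the injection mechanism described in the abstract, the sketch below tries every candidate position for a trigger character and keeps the one with the best score. The scorer here is a hypothetical stand-in: the paper's actual mechanisms use target-label similarity or a masked language model, neither of which is reproduced here, and all function and variable names are illustrative rather than taken from the paper.

```python
# Hedged sketch: choose where to insert a trigger character so the
# poisoned sentence stays as natural-looking as possible. The scorer is
# a placeholder for the paper's MLM- or label-similarity-based scoring.

def fluency_score(chars):
    # Placeholder scorer: penalizes identical adjacent characters,
    # standing in for a masked-LM pseudo-log-likelihood.
    return -sum(1 for a, b in zip(chars, chars[1:]) if a == b)

def inject_trigger(sentence, trigger):
    """Insert `trigger` at the candidate position with the best score."""
    best_pos, best_score = 0, float("-inf")
    for pos in range(len(sentence) + 1):
        candidate = sentence[:pos] + trigger + sentence[pos:]
        score = fluency_score(list(candidate))
        if score > best_score:
            best_pos, best_score = pos, score
    return sentence[:best_pos] + trigger + sentence[best_pos:]

poisoned = inject_trigger("今天天气很好", "彳")
print(poisoned)  # trigger lands where it best breaks up repeated characters
```

With this toy scorer the trigger is placed between the repeated "天天", since that insertion removes the only adjacent-duplicate penalty; a real masked-LM scorer would instead rank positions by how plausible the model finds the resulting sentence.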
Related Articles
50 records
  • [1] UOR: Universal Backdoor Attacks on Pre-trained Language Models
    Du, Wei
    Li, Peixuan
    Zhao, Haodong
    Ju, Tianjie
    Ren, Ge
    Liu, Gongshen
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 7865 - 7877
  • [2] Character-Level Syntax Infusion in Pre-Trained Models for Chinese Semantic Role Labeling
    Wang, Yuxuan
    Lei, Zhilin
    Che, Wanxiang
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2021, 12 (12) : 3503 - 3515
  • [3] Aliasing Backdoor Attacks on Pre-trained Models
    Wei, Cheng'an
    Lee, Yeonjoon
    Chen, Kai
    Meng, Guozhu
    Lv, Peizhuo
    PROCEEDINGS OF THE 32ND USENIX SECURITY SYMPOSIUM, 2023, : 2707 - 2724
  • [5] Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks
    Xi, Zhaohan
    Du, Tianyu
    Li, Changjiang
    Pang, Ren
    Ji, Shouling
    Chen, Jinghui
    Ma, Fenglong
    Wang, Ting
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [6] Backdoor Attacks Against Transfer Learning With Pre-Trained Deep Learning Models
    Wang, Shuo
    Nepal, Surya
    Rudolph, Carsten
    Grobler, Marthie
    Chen, Shangyu
    Chen, Tianle
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2022, 15 (03) : 1526 - 1539
  • [7] Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning
    Li, Linyang
    Song, Demin
    Li, Xiaonan
    Zeng, Jiehang
    Ma, Ruotian
    Qiu, Xipeng
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 3023 - 3032
  • [8] Enhancing pre-trained language models with Chinese character morphological knowledge
    Zheng, Zhenzhong
    Wu, Xiaoming
    Liu, Xiangzhi
    INFORMATION PROCESSING & MANAGEMENT, 2025, 62 (01)
  • [9] Maximum Entropy Loss, the Silver Bullet Targeting Backdoor Attacks in Pre-trained Language Models
    Liu, Zhengxiao
    Shen, Bowen
    Lin, Zheng
    Wang, Fali
    Wang, Weiping
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 3850 - 3868
  • [10] Multi-target Backdoor Attacks for Code Pre-trained Models
    Li, Yanzhou
    Liu, Shangqing
    Chen, Kangjie
    Xie, Xiaofei
    Zhang, Tianwei
    Liu, Yang
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 7236 - 7254