Data Stealing Attacks against Large Language Models via Backdooring

Cited by: 1
Authors
He, Jiaming [1]
Hou, Guanyu [1]
Jia, Xinyue [1]
Chen, Yangyang [1]
Liao, Wenqi [1]
Zhou, Yinhang [2]
Zhou, Rang [1]
Affiliations
[1] Chengdu Univ Technol, Coll Comp Sci & Cyber Secur, Oxford Brookes Coll, Chengdu 610059, Peoples R China
[2] Shenyang Normal Univ, Software Coll, Shenyang 110034, Peoples R China
Keywords
data privacy; large language models; stealing attacks
DOI
10.3390/electronics13142858
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Large language models (LLMs) have gained immense attention and are being increasingly applied in various domains. However, this technological leap forward poses serious security and privacy concerns. This paper explores a novel approach to data stealing attacks by introducing an adaptive method to extract private training data from pre-trained LLMs via backdooring. Our method mainly focuses on the scenario of model customization and is conducted in two phases, backdoor training and backdoor activation, which allow for the extraction of private information without prior knowledge of the model's architecture or training data. During the model customization stage, attackers inject the backdoor into the pre-trained LLM by poisoning a small ratio of the training dataset. During the inference stage, attackers can extract private information from the third-party knowledge database by incorporating the pre-defined backdoor trigger. Our method leverages the customization process of LLMs, injecting a stealthy backdoor that can be triggered after deployment to retrieve private data. Extensive experiments on popular LLM architectures demonstrate the effectiveness of our stealing attack, which achieves a notable attack success rate while remaining stealthy during normal inference.
Pages: 19