Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction

Cited by: 0
Authors
Liu, Tong [1, 2]
Zhang, Yingjie [1, 2]
Zhao, Zhe [3]
Dong, Yinpeng [3, 4]
Meng, Guozhu [1, 2]
Chen, Kai [1, 2]
Affiliations
[1] Institute of Information Engineering, Chinese Academy of Sciences, China
[2] School of Cyber Security, University of Chinese Academy of Sciences, China
[3] RealAI
[4] Tsinghua University, China
Funding
National Natural Science Foundation of China
Keywords
Black boxes - Closed source - Fine designs - Fine tuning - Language model - Model security - Open-source - Reconstruction attacks - Source models - Theoretical foundations
DOI
Not available
Pages: 4711-4728
Related papers
1 item in total
  • [1] Jailbreaking Black Box Large Language Models in Twenty Queries
    Chao, Patrick
    Robey, Alexander
    Dobriban, Edgar
    Hassani, Hamed
    Pappas, George J.
    Wong, Eric
    arXiv, 2023