Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction

Cited by: 0

Authors:

Liu, Tong ^{[1,2]}

Zhang, Yingjie ^{[1,2]}

Zhao, Zhe ^{[3]}

Dong, Yinpeng ^{[3,4]}

Meng, Guozhu ^{[1,2]}

Chen, Kai ^{[1,2]}

Affiliations:

[1] Institute of Information Engineering, Chinese Academy of Sciences, China

[2] School of Cyber Security, University of Chinese Academy of Sciences, China

[3] RealAI

[4] Tsinghua University, China

Source:

Proceedings of the 33rd USENIX Security Symposium | 2024

Funding:

National Natural Science Foundation of China

Keywords:

Black boxes - Closed source - Fine designs - Fine tuning - Language model - Model security - Open-source - Reconstruction attacks - Source models - Theoretical foundations

DOI:

Not available

Pages: 4711-4728