Foot In The Door: Understanding Large Language Model Jailbreaking via Cognitive Psychology

Authors
National University of Defense Technology, China [1]
Unknown [2]
Institutions
Source
Keywords
Compilation and indexing terms; Copyright 2024 Elsevier Inc;
DOI
Not available
CLC number
Subject classification number
Abstract
'current - Black boxes - Cognitive psychology - Consistency theory - Decision-making mechanisms - Language model - Model security - Multisteps - Psychological explanation - Security protection
Related papers
50 in total
  • [1] Cognitive Overload: Jailbreaking Large Language Models with Overloaded Logical Thinking
    Xu, Nan
    Wang, Fei
    Zhou, Ben
    Li, Bangzheng
    Xiao, Chaowei
    Chen, Muhao
    Findings of the Association for Computational Linguistics: NAACL 2024 - Findings, 2024: 3526-3548
  • [2] Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction
    Liu, Tong
    Zhang, Yingjie
    Zhao, Zhe
    Dong, Yinpeng
    Meng, Guozhu
    Chen, Kai
    Proceedings of the 33rd USENIX Security Symposium, 2024: 4711-4728
  • [3] Bootstrapping Cognitive Agents with a Large Language Model
    Zhu, Feiyu
    Simmons, Reid
    Thirty-Eighth AAAI Conference on Artificial Intelligence, Vol 38 No 1, 2024: 655-663
  • [4] Evaluating Large Language Model Understanding of Due Process
    Johnson, Joshua P.
    Lauf, Adrian P.
    2024 IEEE 3rd International Conference on Computing and Machine Intelligence, ICMI 2024, 2024
  • [5] LongVLM: Efficient Long Video Understanding via Large Language Models
    Weng, Yuetian
    Han, Mingfei
    He, Haoyu
    Chang, Xiaojun
    Zhuang, Bohan
    Computer Vision - ECCV 2024, Pt XXXIII, 2025, 15091: 453-470
  • [6] Flight Arrival Scheduling via Large Language Model
    Zhou, Wentao
    Wang, Jinlin
    Zhu, Longtao
    Wang, Yi
    Ji, Yulong
    Aerospace, 2024, 11 (10)
  • [7] Capturing Failures of Large Language Models via Human Cognitive Biases
    Jones, Erik
    Steinhardt, Jacob
    Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022
  • [8] Exploring Vision Language Pretraining with Knowledge Enhancement via Large Language Model
    Tung, Chuenyuet
    Lin, Yi
    Yin, Jianing
    Ye, Qiaoyuchen
    Chen, Hao
    Trustworthy Artificial Intelligence for Healthcare, TAI4H 2024, 2024, 14812: 81-91
  • [9] PathologyVLM: A Large Vision-Language Model for Pathology Image Understanding
    Dai, Dawei
    Zhang, Yuanhui
    Yang, Qianlan
    Xu, Long
    Shen, Xiaojing
    Xia, Shuyin
    Wang, Guoyin
    Artificial Intelligence Review, 58 (6)
  • [10] FashionGPT: A Large Vision-Language Model for Enhancing Fashion Understanding
    Song, Duanxiao
    Gao, Dehong
    Liu, Gongshen
    Li, Xiaoyong
    Artificial Neural Networks and Machine Learning - ICANN 2024, Pt V, 2024, 15020: 308-323