Images are Achilles’ Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking Multimodal Large Language Models

Cited by: 0
Authors
Li, Yifan [1 ,3 ]
Guo, Hangyu [1 ,3 ]
Zhou, Kun [2 ,3 ]
Zhao, Wayne Xin [1 ,3 ]
Wen, Ji-Rong [1 ,2 ,3 ]
Affiliations
[1] Gaoling School of Artificial Intelligence, Renmin University of China, China
[2] School of Information, Renmin University of China, China
[3] Beijing Key Laboratory of Big Data Management and Analysis Methods, China
Keywords
Compilation and indexing terms; Copyright 2025 Elsevier Inc;
DOI
Not available
Abstract
Alignment - Problem oriented languages
Related papers
50 in total
  • [1] Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking Multimodal Large Language Models
    Li, Yifan
    Guo, Hangyu
    Zhou, Kun
    Zhao, Wayne Xin
    Wen, Ji-Rong
    COMPUTER VISION - ECCV 2024, PT LXXIII, 2025, 15131 : 174 - 189
  • [2] Audio Is the Achilles' Heel: Red Teaming Audio Large Multimodal Models
    Yang, Hao
    Qu, Lizhen
    Shareghi, Ehsan
    Haffari, Gholamreza
    arXiv,
  • [3] Visual cognition in multimodal large language models
    Buschoff, Luca M. Schulze
    Akata, Elif
    Bethge, Matthias
    Schulz, Eric
    NATURE MACHINE INTELLIGENCE, 2025, 7 (01) : 96 - 106
  • [4] Exploring Visual Vulnerabilities via Multi-Loss Adversarial Search for Jailbreaking Vision-Language Models
    Hao, Shuyang
    Hooi, Bryan
    Liu, Jun
    Chang, Kai-Wei
    Huang, Zi
    Cai, Yujun
    arXiv,
  • [5] EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models
    Zhou, Weikang
    Wang, Xiao
    Xiong, Limao
    Xia, Han
    Gu, Yingshuang
    Chai, Mingxu
    Zhu, Fukang
    Huang, Caishuang
    Dou, Shihan
    Xi, Zhiheng
    Zheng, Rui
    Gao, Songyang
    Zou, Yicheng
    Yan, Hang
    Le, Yifan
    Wang, Ruohui
    Li, Lijun
    Shao, Jing
    Gui, Tao
    Zhang, Qi
    Huang, Xuanjing
    arXiv,
  • [6] Unveiling the Achilles' Heel of NLG Evaluators: A Unified Adversarial Framework Driven by Large Language Models
    Chen, Yiming
    Zhang, Chen
    Luo, Danqing
    D'Haro, Luis Fernando
    Tan, Robby T.
    Li, Haizhou
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 1359 - 1375
  • [7] Exploring the Transferability of Visual Prompting for Multimodal Large Language Models
    Zhang, Yichi
    Dong, Yinpeng
    Zhang, Siyuan
    Min, Tianzan
    Su, Hang
    Zhu, Jun
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 26552 - 26562
  • [8] Jailbreaking Black Box Large Language Models in Twenty Queries
    Chao, Patrick
    Robey, Alexander
    Dobriban, Edgar
    Hassani, Hamed
    Pappas, George J.
    Wong, Eric
    arXiv, 2023,
  • [9] Unleashing the Unseen: Harnessing Benign Datasets for Jailbreaking Large Language Models
    Zhao, Wei
    Li, Zhe
    Li, Yige
    Sun, Jun
    arXiv,
  • [10] Improved Techniques for Optimization-Based Jailbreaking on Large Language Models
    Jia, Xiaojun
    Pang, Tianyu
    Du, Chao
    Huang, Yihao
    Gu, Jindong
    Liu, Yang
    Cao, Xiaochun
    Lin, Min
    arXiv,