Audio Is the Achilles' Heel: Red Teaming Audio Large Multimodal Models

Cited by: 0
Authors
Yang, Hao [1]
Qu, Lizhen [1]
Shareghi, Ehsan [1]
Haffari, Gholamreza [1]
Affiliations
[1] Department of Data Science & AI, Monash University, Australia
Keywords
Achilles' heel; Condition; Language model; Multi-modal information; Multimodal inputs; Multimodal models; Non-speech audio; Real-world; Red teaming; Text format
DOI
Not available
Related papers
50 in total
  • [31] Audio for Audio is Better? An Investigation on Transfer Learning Models for Heart Sound Classification
    Koike, Tomoya
    Qian, Kun
    Kong, Qiuqiang
    Plumbley, Mark D.
    Schuller, Bjorn W.
    Yamamoto, Yoshiharu
    42ND ANNUAL INTERNATIONAL CONFERENCES OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY: ENABLING INNOVATIVE TECHNOLOGIES FOR GLOBAL HEALTHCARE EMBC'20, 2020, : 74 - 77
  • [32] Adapting Language-Audio Models as Few-Shot Audio Learners
    Liang, Jinhua
    Liu, Xubo
    Liu, Haohe
    Phan, Huy
    Benetos, Emmanouil
    Plumbley, Mark D.
    Wang, Wenwu
    INTERSPEECH 2023, 2023, : 276 - 280
  • [33] Unveiling the Achilles' Heel of NLG Evaluators: A Unified Adversarial Framework Driven by Large Language Models
    Chen, Yiming
    Zhang, Chen
    Luo, Danqing
    D'Haro, Luis Fernando
    Tan, Robby T.
    Li, Haizhou
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 1359 - 1375
  • [34] A Large-scale Depth-based Multimodal Audio-Visual Corpus in Mandarin
    Wang, Jianrong
    Wang, Liyuan
    Zhang, Ju
    Yu, Mei
    Yu, Ruiguo
    Wei, Jianguo
    IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS), 2018, : 881 - 885
  • [35] ASSERT: Automated Safety Scenario Red Teaming for Evaluating the Robustness of Large Language Models
    Mei, Alex
    Levy, Sharon
    Wang, William Yang
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 5831 - 5847
  • [36] Multimodal Object Recognition from Visual and Audio Sequences
    He, Weipeng
    Guan, Haojun
    Zhang, Jianwei
    2015 IEEE INTERNATIONAL CONFERENCE ON MULTISENSOR FUSION AND INTEGRATION FOR INTELLIGENT SYSTEMS (MFI), 2015, : 133 - 138
  • [37] Late multimodal fusion for image and audio music transcription
    Alfaro-Contreras, Maria
    Valero-Mas, Jose J.
    Inesta, Jose M.
    Calvo-Zaragoza, Jorge
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 216
  • [38] Multimodal System for Audio Scene Source Counting and Analysis
    Nigro, Michael
    Krishnan, Sridhar
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 1073 - 1082
  • [39] The role of respiration audio in multimodal analysis of movement qualities
    Lussu, Vincenzo
    Niewiadomski, Radoslaw
    Volpe, Gualtiero
    Camurri, Antonio
    JOURNAL ON MULTIMODAL USER INTERFACES, 2020, 14 (01) : 1 - 15
  • [40] MULTIMODAL SPEECH EMOTION RECOGNITION USING AUDIO AND TEXT
    Yoon, Seunghyun
    Byun, Seokhyun
    Jung, Kyomin
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 112 - 118