ROME: Evaluating Pre-trained Vision-Language Models on Reasoning beyond Visual Common Sense

被引:0
|
作者
Zhou, Kankan [1 ]
Lai, Eason [1 ]
Yeong, Wei Bin Au [1 ]
Mouratidis, Kyriakos [1 ]
Jiang, Jing [1 ]
机构
[1] Singapore Management Univ, Sch Comp & Informat Syst, Singapore, Singapore
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
引用
收藏
页码:10185 / 10197
页数:13
相关论文
共 50 条
  • [1] Compressing and Debiasing Vision-Language Pre-Trained Models for Visual Question Answering
    Si, Qingyi
    Liu, Yuanxin
    Lin, Zheng
    Fu, Peng
    Cao, Yanan
    Wang, Weiping
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 513 - 529
  • [2] Universal Adversarial Perturbations for Vision-Language Pre-trained Models
    Zhang, Peng-Fei
    Huang, Zi
    Bai, Guangdong
    PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024, 2024, : 862 - 871
  • [3] CPT: Colorful Prompt Tuning for pre-trained vision-language models
    Yao, Yuan
    Zhang, Ao
    Zhang, Zhengyan
    Liu, Zhiyuan
    Chua, Tat-Seng
    Sun, Maosong
    AI OPEN, 2024, 5 : 30 - 38
  • [4] Multimodal Search on Iconclass using Vision-Language Pre-Trained Models
    Santini, Cristian
    Posthumus, Etienne
    Tietz, Tabea
    Tan, Mary Ann
    Bruns, Oleksandra
    Sack, Harald
    2023 ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES, JCDL, 2023, : 285 - 287
  • [5] Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models
    Wu, Qiong
    Yu, Wei
    Zhou, Yiyi
    Huang, Shubin
    Sun, Xiaoshuai
    Ji, Rongrong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [6] p-Laplacian Adaptation for Generative Pre-trained Vision-Language Models
    Wu, Haoyuan
    Zhang, Xinyun
    Xu, Peng
    Liao, Peiyu
    Yao, Xufeng
    Yu, Bei
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 6, 2024, : 6003 - 6011
  • [7] Robotic Applications of Pre-Trained Vision-Language Models to Various Recognition Behaviors
    Kawaharazuka, Kento
    Obinata, Yoshiki
    Kanazawa, Naoaki
    Okada, Kei
    Inaba, Masayuki
    2023 IEEE-RAS 22ND INTERNATIONAL CONFERENCE ON HUMANOID ROBOTS, HUMANOIDS, 2023,
  • [8] Regularized Mask Tuning: Uncovering Hidden Knowledge in Pre-trained Vision-Language Models
    Zheng, Kecheng
    Wu, Wei
    Feng, Ruili
    Zhu, Kai
    Liu, Jiawei
    Zhao, Deli
    Zha, Zheng-Jun
    Chen, Wei
    Shen, Yujun
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 11629 - 11639
  • [9] VLATTACK: Multimodal Adversarial Attacks on Vision-Language Tasks via Pre-trained Models
    Yin, Ziyi
    Ye, Muchao
    Zhang, Tianrong
    Du, Tianyu
    Zhu, Jinguo
    Liu, Han
    Chen, Jinghui
    Wang, Ting
    Ma, Fenglong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [10] Open-World Object Manipulation using Pre-Trained Vision-Language Models
    Stone, Austin
    Xiao, Ted
    Lu, Yao
    Gopalakrishnan, Keerthana
    Lee, Kuang-Huei
    Quan Vuong
    Wohlhart, Paul
    Kirmani, Sean
    Zitkovich, Brianna
    Xia, Fei
    Finn, Chelsea
    Hausman, Karol
    CONFERENCE ON ROBOT LEARNING, VOL 229, 2023, 229