Visually-augmented Pretrained Language Models for NLP Tasks without Images

Cited by: 0
Authors
Guo, Hangyu [1 ]
Zhou, Kun [3 ,4 ]
Zhao, Wayne Xin [2 ,4 ]
Zhang, Qinyu [1 ]
Wen, Ji-Rong [2 ,3 ,4 ]
Affiliations
[1] Harbin Inst Technol Shenzhen, Sch Elect & Informat Engn, Shenzhen, Peoples R China
[2] Renmin Univ China, Gaoling Sch Artificial Intelligence, Beijing, Peoples R China
[3] Renmin Univ China, Sch Informat, Beijing, Peoples R China
[4] Beijing Key Lab Big Data Management & Anal Method, Beijing, Peoples R China
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Although pre-trained language models (PLMs) show impressive performance from text-only self-supervised training, they have been found to lack visual semantics or commonsense. Existing solutions often rely on explicit images for visual knowledge augmentation (requiring time-consuming retrieval or generation), and they apply the augmentation to the whole input text, without considering whether it is actually needed for specific inputs or tasks. To address these issues, we propose VAWI, a novel Visually-Augmented fine-tuning approach that can be generally applied to various PLMs and NLP tasks, Without using any retrieved or generated Images. Experimental results show that our approach consistently improves the performance of BERT, RoBERTa, BART, and T5 at different scales, and outperforms several competitive baselines on ten tasks. Our code and data are publicly available at https://github.com/RUCAIBox/VAWI.
Pages: 14912-14929
Number of pages: 18
Related Papers
20 records in total
  • [1] Learning to Imagine: Visually-Augmented Natural Language Generation
    Tang, Tianyi
    Chen, Yushuo
    Du, Yifan
    Li, Junyi
    Zhao, Wayne Xin
    Wen, Ji-Rong
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 9468 - 9481
  • [2] Evaluation of Pretrained Large Language Models in Embodied Planning Tasks
    Sarkisyan, Christina
    Korchemnyi, Alexandr
    Kovalev, Alexey K.
    Panov, Aleksandr I.
    ARTIFICIAL GENERAL INTELLIGENCE, AGI 2023, 2023, 13921 : 222 - 232
  • [3] Do Pretrained Language Models Indeed Understand Software Engineering Tasks?
    Li, Yao
    Zhang, Tao
    Luo, Xiapu
    Cai, Haipeng
    Fang, Sen
    Yuan, Dawei
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2023, 49 (10) : 4639 - 4655
  • [4] The Use of Clinical Language Models Pretrained on Institutional EHR Data for Downstream Tasks
    Suvirat, Kerdkiat
    Chairat, Sawrawit
    Horsiritham, Kanakorn
    Ingviya, Thammasin
    Kongkamol, Chanon
    Chaichulee, Sitthichok
    2024 21ST INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING, JCSSE 2024, 2024, : 648 - 655
  • [5] Visually Grounded Language Learning: a Review of Language Games, Datasets, Tasks, and Models
    Suglia, Alessandro
    Konstas, Ioannis
    Lemon, Oliver
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2024, 79 : 173 - 239
  • [6] Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning
    Wei, Colin
    Xie, Sang Michael
    Ma, Tengyu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
  • [7] Images in Language Space: Exploring the Suitability of Large Language Models for Vision & Language Tasks
    Hakimov, Sherzod
    Schlangen, David
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 14196 - 14210
  • [8] Can Pretrained English Language Models Benefit Non-English NLP Systems in Low-Resource Scenarios?
    Chi, Zewen
    Huang, Heyan
    Liu, Luyang
    Bai, Yu
    Gao, Xiaoyan
    Mao, Xian-Ling
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 1061 - 1074
  • [9] From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models
    Feng, Shangbin
    Park, Chan Young
    Liu, Yuhan
    Tsvetkov, Yulia
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 11737 - 11762