HyperPELT: Unified Parameter-Efficient Language Model Tuning for Both Language and Vision-and-Language Tasks

Cited by: 0
Authors
Zhang, Zhengkun [1 ]
Guo, Wenya [1 ]
Meng, Xiaojun [2 ]
Wang, Yasheng [2 ]
Wang, Yadao [2 ]
Jiang, Xin [2 ]
Liu, Qun [2 ]
Yang, Zhenglu [1 ]
Affiliations
[1] Nankai Univ, CS, TKLNDST, Tianjin, Peoples R China
[2] Huawei Technol, Noahs Ark Lab, Beijing, Peoples R China
DOI: not available
Abstract
With the scale and capacity of pretrained models growing rapidly, parameter-efficient language model tuning has emerged as a popular paradigm for solving various NLP and Vision-and-Language (V&L) tasks. In this paper, we design a unified parameter-efficient multitask learning framework that works effectively on both NLP and V&L tasks. In particular, we use a shared hypernetwork that takes trainable hyper-embeddings and the visual modality as input, and outputs weights for different modules in a pretrained language model, such as the parameters inserted into multi-head attention blocks (i.e., prefix-tuning) and feed-forward blocks (i.e., adapter-tuning). Our proposed framework adds fewer trainable parameters in multi-task learning while achieving superior performance and transfer ability compared to state-of-the-art methods. Empirical results on the GLUE benchmark and multiple V&L tasks confirm the effectiveness of our framework.
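The core idea in the abstract can be illustrated with a minimal sketch: a single shared hypernetwork maps a trainable task hyper-embedding (here simply fused with a pooled visual feature by addition) to the weights of a bottleneck adapter and to prefix vectors for one transformer layer. This is not the authors' code; all dimensions, names, and the fusion scheme below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 16      # hidden size of the frozen language model (assumed)
d_embed = 8       # hyper-embedding size (assumed)
d_bottleneck = 4  # adapter bottleneck size (assumed)
prefix_len = 2    # number of prefix tokens per attention block (assumed)

class HyperNetwork:
    """Generates adapter and prefix parameters from one hyper-embedding.

    A single linear projection is shared across tasks; only the small
    hyper-embeddings are task-specific, which is what keeps the number
    of trainable parameters low in multi-task learning.
    """
    def __init__(self):
        n_adapter = 2 * d_model * d_bottleneck  # down- and up-projection
        n_prefix = 2 * prefix_len * d_model     # key and value prefixes
        self.W = rng.normal(0.0, 0.02, (d_embed, n_adapter + n_prefix))

    def __call__(self, hyper_embedding):
        flat = hyper_embedding @ self.W
        a = d_model * d_bottleneck
        b = 2 * d_model * d_bottleneck
        down = flat[:a].reshape(d_model, d_bottleneck)
        up = flat[a:b].reshape(d_bottleneck, d_model)
        prefixes = flat[b:].reshape(2, prefix_len, d_model)  # (K, V)
        return down, up, prefixes

# Task hyper-embedding fused with a pooled visual feature (additive fusion
# is an assumption made for brevity).
task_emb = rng.normal(size=d_embed)
visual_feat = rng.normal(size=d_embed)
hyper_emb = task_emb + visual_feat

hypernet = HyperNetwork()
down, up, prefixes = hypernet(hyper_emb)

# Generated adapter applied as a residual ReLU bottleneck on a hidden state,
# as in standard adapter-tuning; the generated prefixes would be prepended
# to the keys and values of a multi-head attention block.
h = rng.normal(size=d_model)
h_out = h + np.maximum(h @ down, 0.0) @ up
```

The frozen language model never updates its own weights; only the hypernetwork and the per-task hyper-embeddings are trained, and the same hypernetwork serves every inserted module.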
Pages: 11442-11453 (12 pages)
Related papers (50 in total)
  • [31] SkyEyeGPT: Unifying remote sensing vision-language tasks via instruction tuning with large language model
    Zhan, Yang
    Xiong, Zhitong
    Yuan, Yuan
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2025, 221 : 64 - 77
  • [32] Episodic Transformer for Vision-and-Language Navigation
    Pashevich, Alexander
    Schmid, Cordelia
    Sun, Chen
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 15922 - 15932
  • [33] ADT: An Additive Delta-Tuning approach for parameter-efficient tuning in pre-trained language models
    Li, Dong
    Tang, Jintao
    Li, Shasha
    Wang, Ting
    2024 6TH INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING, ICNLP 2024, 2024, : 382 - 386
  • [34] WebVLN: Vision-and-Language Navigation on Websites
    Chen, Qi
    Pitawela, Dileepa
    Zhao, Chongyang
    Zhou, Gengze
    Chen, Hsiang-Ting
    Wu, Qi
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 2, 2024, : 1165 - 1173
  • [35] RingMoGPT: A Unified Remote Sensing Foundation Model for Vision, Language, and Grounded Tasks
    Wang, Peijin
    Hu, Huiyang
    Tong, Boyuan
    Zhang, Ziqi
    Yao, Fanglong
    Feng, Yingchao
    Zhu, Zining
    Chang, Hao
    Diao, Wenhui
    Ye, Qixiang
    Sun, Xian
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
  • [36] Parameter-Efficient Korean Character-Level Language Modeling
    Cognetta, Marco
    Moon, Sangwhan
    Wolf-Sonkin, Lawrence
    Okazaki, Naoaki
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 2350 - 2356
  • [37] Parameter-Efficient Conversational Recommender System as a Language Processing Task
    Ravaut, Mathieu
    Zhang, Hao
    Xu, Lu
    Sun, Aixin
    Liu, Yong
    PROCEEDINGS OF THE 18TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 152 - 165
  • [38] Effect of Visual Extensions on Natural Language Understanding in Vision-and-Language Models
    Iki, Taichi
    Aizawa, Akiko
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 2189 - 2196
  • [39] Parameter-Efficient Korean Character-Level Language Modeling
    Cognetta, Marco
    Wolf-Sonkin, Lawrence
    Moon, Sangwhan
    Okazaki, Naoaki
    EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference, 2023, : 2342 - 2348
  • [40] Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in Multimodal Transformers
    Frank, Stella
    Bugliarello, Emanuele
    Elliott, Desmond
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 9847 - 9857