Learning to Switch off, Switch on, and Integrate Modalities in Large Pre-trained Transformers

Citations: 0
Authors
Duseja, Tejas [1 ]
Annervaz, K. M. [1 ]
Duggani, Jeevithiesh [1 ]
Zacharia, Shyam [2 ]
Free, Michael [3 ]
Dukkipati, Ambedkar [1 ]
Affiliations
[1] Indian Inst Sci, Bengaluru, India
[2] British Telecom, Bengaluru, India
[3] British Telecom, London, England
Source
2024 IEEE 7TH INTERNATIONAL CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL, MIPR 2024 | 2024
Keywords
Multi-modal emotion recognition; sentiment analysis; pre-trained models;
DOI
10.1109/MIPR62202.2024.00070
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Transformer models, which revolutionized foundation models, are now ubiquitous, and there has been a surge of pre-trained transformers that can be fine-tuned for different downstream tasks. However, most pre-trained transformers are trained on only a single modality, and there is no direct way to fine-tune them on multiple modalities. To tackle this issue, we propose a general-purpose gate, SSIM (Switch off, Switch on, and Integrate Modalities), by which other modalities can be integrated into large pre-trained language transformers. The proposed SSIM gate obtains a unified representation by soft-switching between multi-modal interactions. To evaluate our approach, we establish benchmarks using pre-trained language transformers such as BERT, XLNet, and T5 on multi-modal tasks including sentiment and emotion analysis (CMU-MOSI, CMU-MOSEI), emotion recognition in conversations (IEMOCAP, MELD), and multimodal intent recognition (MIntRec), achieving close to state-of-the-art results.
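The abstract does not give the exact gate equations, but the "soft-switching" idea it describes can be illustrated with a generic sigmoid-gated fusion: a learned gate blends the language transformer's hidden state with a projection of the second modality. The function name, weight shapes, and random weights below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def soft_modality_gate(h_text, h_other, W_gate, W_proj):
    """Hypothetical sketch of a soft switch between two modalities.

    h_text : (d,) hidden state from the pre-trained language transformer
    h_other: (k,) feature vector from the second modality (e.g. audio)
    W_gate : (d, d + k) gate weights (learned in practice, random here)
    W_proj : (d, k) projection of the other modality into the text space
    """
    # Gate values lie in (0, 1): ~1 keeps the text channel ("switch off"
    # the other modality), ~0 switches the other modality on.
    gate = sigmoid(W_gate @ np.concatenate([h_text, h_other]))
    h_proj = W_proj @ h_other  # align dimensions with the text space
    # Elementwise convex combination = the "integrate" step.
    return gate * h_text + (1.0 - gate) * h_proj

d, k = 8, 4
h_text = rng.standard_normal(d)
h_audio = rng.standard_normal(k)
W_gate = rng.standard_normal((d, d + k))
W_proj = rng.standard_normal((d, k))

fused = soft_modality_gate(h_text, h_audio, W_gate, W_proj)
print(fused.shape)  # (8,) — same dimensionality as the text hidden state
```

Because the gate is a per-dimension convex combination, the fused vector stays within the elementwise range spanned by the two (projected) modality representations, which is what lets the model interpolate between "text only" and "other modality only" behaviour.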
Pages: 403-409
Page count: 7