Adapting Generative Pre-trained Language Model for Open-domain Multimodal Sentence Summarization

Cited by: 6
Authors
Lin, Dengtian [1 ]
Jing, Liqiang [1 ]
Song, Xuemeng [1 ]
Liu, Meng [2 ]
Sun, Teng [1 ]
Nie, Liqiang [3 ]
Affiliations
[1] Shandong Univ, Jinan, Peoples R China
[2] Shandong Jianzhu Univ, Jinan, Peoples R China
[3] Harbin Inst Technol Shenzhen, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multimodal Summarization; Pre-trained Language Model; Prompt Learning;
DOI
10.1145/3539618.3591633
CLC Classification
TP [Automation and Computer Technology]
Discipline Code
0812
Abstract
Multimodal sentence summarization (MMSS), which aims to generate a brief summary of a source sentence and its paired image, is a new yet challenging task. Although existing methods have achieved compelling success, they still suffer from two key limitations: 1) they do not adapt generative pre-trained language models to open-domain MMSS, and 2) they lack explicit modeling of critical information. To address these limitations, we propose the BART-MMSS framework, which adopts BART as its backbone. Specifically, we propose a prompt-guided image encoding module to extract the source image features: it leverages several learnable soft prompts for image patch embedding, which facilitates injecting visual content into BART for open-domain MMSS. Thereafter, we devise an explicit source critical token learning module that directly captures the critical tokens of the source sentence with reference to the source image, where we incorporate explicit supervision to improve performance. Extensive experiments on a public dataset fully validate the superiority of the proposed method. In addition, the tokens predicted by the vision-guided key-token highlighting module can be easily understood by humans, which improves the interpretability of our model.
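To make the two modules described in the abstract concrete, below is a minimal PyTorch sketch of the general idea: a set of learnable soft prompts attends over image patch features and is projected into the language model's embedding space (so it can be prepended to BART's token embeddings), while a per-token head scores source tokens under explicit supervision. All class names, dimensions, and the cross-attention design here are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class PromptGuidedImageEncoder(nn.Module):
    """Illustrative sketch only: learnable soft prompts query image patch
    features via cross-attention; the attended prompts are projected into
    the summarizer's embedding space."""

    def __init__(self, num_prompts=8, patch_dim=768, lm_dim=1024, num_heads=8):
        super().__init__()
        # Soft prompts: free vectors learned end-to-end, tied to no vocabulary token.
        self.prompts = nn.Parameter(torch.randn(num_prompts, patch_dim) * 0.02)
        # Prompts (queries) attend over patch features (keys/values).
        self.cross_attn = nn.MultiheadAttention(patch_dim, num_heads, batch_first=True)
        # Map the attended prompts into the language model's hidden size.
        self.proj = nn.Linear(patch_dim, lm_dim)

    def forward(self, patch_feats):
        # patch_feats: (batch, num_patches, patch_dim), e.g. ViT patch embeddings.
        queries = self.prompts.unsqueeze(0).expand(patch_feats.size(0), -1, -1)
        attended, _ = self.cross_attn(queries, patch_feats, patch_feats)
        # Output (batch, num_prompts, lm_dim) can be prepended to the text
        # token embeddings before BART's encoder runs.
        return self.proj(attended)

class CriticalTokenHead(nn.Module):
    """Also an assumption: a per-token scorer over encoder states, trained
    with explicit supervision (e.g. whether a source token reappears in the
    reference summary) via binary cross-entropy."""

    def __init__(self, lm_dim=1024):
        super().__init__()
        self.scorer = nn.Linear(lm_dim, 1)

    def forward(self, encoder_states):
        # encoder_states: (batch, seq_len, lm_dim) -> one "criticality" logit per token.
        return self.scorer(encoder_states).squeeze(-1)
```

Under this reading, both the image-derived prompt vectors and the token-level supervision signal are differentiable, so the whole pipeline could be fine-tuned jointly with the summarization loss; the per-token logits also give the human-readable token highlights the abstract mentions.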
Pages: 195-204
Number of pages: 10