Adapting Generative Pre-trained Language Model for Open-domain Multimodal Sentence Summarization

Cited by: 6
Authors
Lin, Dengtian [1 ]
Jing, Liqiang [1 ]
Song, Xuemeng [1 ]
Liu, Meng [2 ]
Sun, Teng [1 ]
Nie, Liqiang [3 ]
Affiliations
[1] Shandong Univ, Jinan, Peoples R China
[2] Shandong Jianzhu Univ, Jinan, Peoples R China
[3] Harbin Inst Technol Shenzhen, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multimodal Summarization; Pre-trained Language Model; Prompt Learning;
DOI
10.1145/3539618.3591633
CLC Classification
TP [Automation and Computer Technology]
Discipline Code
0812
Abstract
Multimodal sentence summarization (MMSS), which aims to generate a brief summary of a source sentence and its paired image, is a new yet challenging task. Although existing methods have achieved compelling success, they still suffer from two key limitations: 1) they do not adapt generative pre-trained language models to open-domain MMSS, and 2) they lack explicit modeling of critical information. To address these limitations, we propose the BART-MMSS framework, which adopts BART as its backbone. Specifically, we propose a prompt-guided image encoding module to extract the source image features: it leverages several learnable soft prompts for image patch embedding, which facilitates injecting visual content into BART for open-domain MMSS. Thereafter, we devise an explicit source critical token learning module that directly captures the critical tokens of the source sentence with reference to the source image, where we incorporate explicit supervision to improve performance. Extensive experiments on a public dataset fully validate the superiority of the proposed method. In addition, the tokens predicted by the vision-guided key-token highlighting module can be easily understood by humans, which improves the interpretability of our model.
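To make the two modules described in the abstract concrete, below is a minimal PyTorch sketch of the general idea: a set of learnable soft prompts attends over image patch features and is projected into the language model's embedding space (so it can be prepended to BART's token embeddings), while a per-token head scores source tokens under explicit supervision. All class names, dimensions, and the cross-attention design here are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class PromptGuidedImageEncoder(nn.Module):
    """Illustrative sketch only: learnable soft prompts query image patch
    features via cross-attention; the attended prompts are projected into
    the summarizer's embedding space."""

    def __init__(self, num_prompts=8, patch_dim=768, lm_dim=1024, num_heads=8):
        super().__init__()
        # Soft prompts: free vectors learned end-to-end, tied to no vocabulary token.
        self.prompts = nn.Parameter(torch.randn(num_prompts, patch_dim) * 0.02)
        # Prompts (queries) attend over patch features (keys/values).
        self.cross_attn = nn.MultiheadAttention(patch_dim, num_heads, batch_first=True)
        # Map the attended prompts into the language model's hidden size.
        self.proj = nn.Linear(patch_dim, lm_dim)

    def forward(self, patch_feats):
        # patch_feats: (batch, num_patches, patch_dim), e.g. ViT patch embeddings.
        queries = self.prompts.unsqueeze(0).expand(patch_feats.size(0), -1, -1)
        attended, _ = self.cross_attn(queries, patch_feats, patch_feats)
        # Output (batch, num_prompts, lm_dim) can be prepended to the text
        # token embeddings before BART's encoder runs.
        return self.proj(attended)

class CriticalTokenHead(nn.Module):
    """Also an assumption: a per-token scorer over encoder states, trained
    with explicit supervision (e.g. whether a source token reappears in the
    reference summary) via binary cross-entropy."""

    def __init__(self, lm_dim=1024):
        super().__init__()
        self.scorer = nn.Linear(lm_dim, 1)

    def forward(self, encoder_states):
        # encoder_states: (batch, seq_len, lm_dim) -> one "criticality" logit per token.
        return self.scorer(encoder_states).squeeze(-1)
```

Under this reading, both the image-derived prompt vectors and the token-level supervision signal are differentiable, so the whole pipeline could be fine-tuned jointly with the summarization loss; the per-token logits also give the human-readable token highlights the abstract mentions.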
Pages: 195-204
Number of pages: 10