Somun: entity-centric summarization incorporating pre-trained language models

Cited by: 3
Authors
Inan, Emrah [1 ]
Affiliations
[1] Univ Manchester, Sch Comp Sci, Natl Ctr Text Min, Manchester, Lancs, England
Source
NEURAL COMPUTING & APPLICATIONS | 2021, Vol. 33, Issue 10
Keywords
Automatic text summarization; Language models; Harmonic centrality; FEATURE-EXTRACTION; CENTRALITY;
DOI
10.1007/s00521-020-05319-2
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text summarization addresses the problem of capturing essential information from a large volume of text. Existing methods depend either on end-to-end models or on hand-crafted preprocessing steps. In this study, we propose an entity-centric summarization method that extracts named entities and produces a small graph with a dependency parser. To extract entities, we employ well-known pre-trained language models. After generating the graph, we perform summarization by ranking entities with the harmonic centrality algorithm. Experiments show that we outperform state-of-the-art unsupervised baselines, improving ROUGE-1 by more than 10% and ROUGE-2 by more than 50%. Moreover, we achieve results comparable to recent end-to-end models.
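The abstract outlines a three-step pipeline: entity extraction with a pre-trained language model, construction of a small entity graph from parsed sentences, and entity ranking by harmonic centrality. The sketch below is a rough illustration of that style of pipeline, not the paper's implementation: spaCy's en_core_web_sm stands in for the pre-trained extractor and dependency parser, edges simply link entities that co-occur in a parsed sentence, and the summary is assembled from the sentences mentioning the highest-ranked entities.

```python
# Minimal sketch of an entity-centric, centrality-ranked summarizer.
# Assumptions (not from the paper): spaCy as the NER/parser, co-occurrence
# edges within a sentence, and sentence selection by summed entity scores.
import networkx as nx
import spacy


def entity_centric_summary(text: str, num_sentences: int = 3) -> str:
    nlp = spacy.load("en_core_web_sm")   # assumed model; any NER + parser pipeline works
    doc = nlp(text)
    sentences = list(doc.sents)

    # Build a small entity graph: one node per entity string, an edge between
    # entities whose mentions co-occur in the same parsed sentence.
    graph = nx.Graph()
    for sent in sentences:
        ents = sorted({ent.text for ent in sent.ents})
        graph.add_nodes_from(ents)
        graph.add_edges_from((a, b) for i, a in enumerate(ents) for b in ents[i + 1:])

    # Rank entities with harmonic centrality, as named in the abstract.
    scores = nx.harmonic_centrality(graph)

    # Score each sentence by the centrality of the entities it mentions,
    # keep the top-scoring sentences, and emit them in document order.
    top = sorted(
        sentences,
        key=lambda s: sum(scores.get(e.text, 0.0) for e in s.ents),
        reverse=True,
    )[:num_sentences]
    return " ".join(s.text for s in sentences if s in top)
```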
Pages: 5301-5311
Number of pages: 11
Related papers
50 in total
  • [41] ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning
    Qin, Yujia
    Lin, Yankai
    Takanobu, Ryuichi
    Liu, Zhiyuan
    Li, Peng
    Ji, Heng
    Huang, Minlie
    Sun, Maosong
    Zhou, Jie
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (ACL-IJCNLP 2021), VOL 1, 2021, : 3350 - 3363
  • [42] Probing Pre-trained Auto-regressive Language Models for Named Entity Typing and Recognition
    Epure, Elena V.
    Hennequin, Romain
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 1408 - 1417
  • [43] TOKEN Is a MASK: Few-shot Named Entity Recognition with Pre-trained Language Models
    Davody, Ali
    Adelani, David Ifeoluwa
    Kleinbauer, Thomas
    Klakow, Dietrich
    TEXT, SPEECH, AND DIALOGUE (TSD 2022), 2022, 13502 : 138 - 150
  • [44] A Study of Pre-trained Language Models in Natural Language Processing
    Duan, Jiajia
    Zhao, Hui
    Zhou, Qian
    Qiu, Meikang
    Liu, Meiqin
    2020 IEEE INTERNATIONAL CONFERENCE ON SMART CLOUD (SMARTCLOUD 2020), 2020, : 116 - 121
  • [45] An Opinion Summarization-Evaluation System Based on Pre-trained Models
    Jiang, Han
    Wang, Yubin
    Lv, Songhao
    Wei, Zhihua
    ROUGH SETS (IJCRS 2021), 2021, 12872 : 225 - 230
  • [46] Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization
    Matsuura, Kohei
    Ashihara, Takanori
    Moriya, Takafumi
    Tanaka, Tomohiro
    Kano, Takatomo
    Ogawa, Atsunori
    Delcroix, Marc
    INTERSPEECH 2023, 2023, : 2943 - 2947
  • [47] From Cloze to Comprehension: Retrofitting Pre-trained Masked Language Models to Pre-trained Machine Reader
    Xu, Weiwen
    Li, Xin
    Zhang, Wenxuan
    Zhou, Meng
    Lam, Wai
    Si, Luo
    Bing, Lidong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [48] Pre-trained models for natural language processing: A survey
    Qiu XiPeng
    Sun TianXiang
    Xu YiGe
    Shao YunFan
    Dai Ning
    Huang XuanJing
    SCIENCE CHINA-TECHNOLOGICAL SCIENCES, 2020, 63 (10) : 1872 - 1897
  • [49] Probing Pre-Trained Language Models for Disease Knowledge
    Alghanmi, Israa
    Espinosa-Anke, Luis
    Schockaert, Steven
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 3023 - 3033
  • [50] Analyzing Individual Neurons in Pre-trained Language Models
    Durrani, Nadir
    Sajjad, Hassan
    Dalvi, Fahim
    Belinkov, Yonatan
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 4865 - 4880