Somun: entity-centric summarization incorporating pre-trained language models

Cited by: 3
Authors
Inan, Emrah [1 ]
Affiliations
[1] Univ Manchester, Sch Comp Sci, Natl Ctr Text Min, Manchester, Lancs, England
Source
NEURAL COMPUTING & APPLICATIONS | 2021, Vol. 33, Issue 10
Keywords
Automatic text summarization; Language models; Harmonic centrality; FEATURE-EXTRACTION; CENTRALITY;
DOI
10.1007/s00521-020-05319-2
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text summarization addresses the problem of capturing essential information from a large volume of text. Existing methods depend either on end-to-end models or on hand-crafted preprocessing steps. In this study, we propose an entity-centric summarization method that extracts named entities and produces a small graph with a dependency parser. To extract entities, we employ well-known pre-trained language models. After generating the graph, we perform summarization by ranking entities with the harmonic centrality algorithm. Experiments show that we outperform state-of-the-art unsupervised baselines, improving ROUGE-1 by more than 10% and ROUGE-2 by more than 50%. Moreover, we achieve results comparable to recent end-to-end models.
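The abstract outlines a three-step pipeline: entity extraction with a pre-trained language model, construction of a small entity graph from parsed sentences, and entity ranking by harmonic centrality. The sketch below is a rough illustration of that style of pipeline, not the paper's implementation: spaCy's en_core_web_sm stands in for the pre-trained extractor and dependency parser, edges simply link entities that co-occur in a parsed sentence, and the summary is assembled from the sentences mentioning the highest-ranked entities.

```python
# Minimal sketch of an entity-centric, centrality-ranked summarizer.
# Assumptions (not from the paper): spaCy as the NER/parser, co-occurrence
# edges within a sentence, and sentence selection by summed entity scores.
import networkx as nx
import spacy


def entity_centric_summary(text: str, num_sentences: int = 3) -> str:
    nlp = spacy.load("en_core_web_sm")   # assumed model; any NER + parser pipeline works
    doc = nlp(text)
    sentences = list(doc.sents)

    # Build a small entity graph: one node per entity string, an edge between
    # entities whose mentions co-occur in the same parsed sentence.
    graph = nx.Graph()
    for sent in sentences:
        ents = sorted({ent.text for ent in sent.ents})
        graph.add_nodes_from(ents)
        graph.add_edges_from((a, b) for i, a in enumerate(ents) for b in ents[i + 1:])

    # Rank entities with harmonic centrality, as named in the abstract.
    scores = nx.harmonic_centrality(graph)

    # Score each sentence by the centrality of the entities it mentions,
    # keep the top-scoring sentences, and emit them in document order.
    top = sorted(
        sentences,
        key=lambda s: sum(scores.get(e.text, 0.0) for e in s.ents),
        reverse=True,
    )[:num_sentences]
    return " ".join(s.text for s in sentences if s in top)
```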
Pages: 5301-5311
Number of pages: 11
Related papers
50 in total
  • [41] ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning
    Qin, Yujia
    Lin, Yankai
    Takanobu, Ryuichi
    Liu, Zhiyuan
    Li, Peng
    Ji, Heng
    Huang, Minlie
    Sun, Maosong
    Zhou, Jie
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (ACL-IJCNLP 2021), VOL 1, 2021, : 3350 - 3363
  • [42] Probing Pre-trained Auto-regressive Language Models for Named Entity Typing and Recognition
    Epure, Elena V.
    Hennequin, Romain
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 1408 - 1417
  • [43] TOKEN Is a MASK: Few-shot Named Entity Recognition with Pre-trained Language Models
    Davody, Ali
    Adelani, David Ifeoluwa
    Kleinbauer, Thomas
    Klakow, Dietrich
    TEXT, SPEECH, AND DIALOGUE (TSD 2022), 2022, 13502 : 138 - 150
  • [44] A Study of Pre-trained Language Models in Natural Language Processing
    Duan, Jiajia
    Zhao, Hui
    Zhou, Qian
    Qiu, Meikang
    Liu, Meiqin
    2020 IEEE INTERNATIONAL CONFERENCE ON SMART CLOUD (SMARTCLOUD 2020), 2020, : 116 - 121
  • [45] An Opinion Summarization-Evaluation System Based on Pre-trained Models
    Jiang, Han
    Wang, Yubin
    Lv, Songhao
    Wei, Zhihua
    ROUGH SETS (IJCRS 2021), 2021, 12872 : 225 - 230
  • [46] Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization
    Matsuura, Kohei
    Ashihara, Takanori
    Moriya, Takafumi
    Tanaka, Tomohiro
    Kano, Takatomo
    Ogawa, Atsunori
    Delcroix, Marc
    INTERSPEECH 2023, 2023, : 2943 - 2947
  • [47] From Cloze to Comprehension: Retrofitting Pre-trained Masked Language Models to Pre-trained Machine Reader
    Xu, Weiwen
    Li, Xin
    Zhang, Wenxuan
    Zhou, Meng
    Lam, Wai
    Si, Luo
    Bing, Lidong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [48] Pre-trained models for natural language processing: A survey
    Qiu XiPeng
    Sun TianXiang
    Xu YiGe
    Shao YunFan
    Dai Ning
    Huang XuanJing
    SCIENCE CHINA-TECHNOLOGICAL SCIENCES, 2020, 63 (10) : 1872 - 1897
  • [49] Probing Pre-Trained Language Models for Disease Knowledge
    Alghanmi, Israa
    Espinosa-Anke, Luis
    Schockaert, Steven
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 3023 - 3033
  • [50] Analyzing Individual Neurons in Pre-trained Language Models
    Durrani, Nadir
    Sajjad, Hassan
    Dalvi, Fahim
    Belinkov, Yonatan
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 4865 - 4880