Somun: entity-centric summarization incorporating pre-trained language models

Cited by: 3
Authors
Inan, Emrah [1]
Affiliations
[1] Univ Manchester, Sch Comp Sci, Natl Ctr Text Min, Manchester, Lancs, England
Source
NEURAL COMPUTING & APPLICATIONS, 2021, 33 (10)
Keywords
Automatic text summarization; Language models; Harmonic centrality; Feature extraction; Centrality
DOI
10.1007/s00521-020-05319-2
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Text summarization addresses the problem of capturing the essential information in a large volume of text. Existing methods depend either on end-to-end models or on hand-crafted preprocessing steps. In this study, we propose an entity-centric summarization method that extracts named entities and builds a small graph over them with a dependency parser. To extract the entities, we employ well-known pre-trained language models. After generating the graph, we produce the summary by ranking entities with the harmonic centrality algorithm. Experiments show that our method outperforms state-of-the-art unsupervised baselines, improving ROUGE-1 scores by more than 10% and ROUGE-2 scores by more than 50%. Moreover, it achieves results comparable to recent end-to-end models.
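
To make the ranking step concrete, below is a minimal Python sketch, not the authors' implementation: the entity graph here is hand-built from hypothetical edges (the paper derives edges from a dependency parse, not from the plain co-occurrence assumed here), and networkx's harmonic_centrality ranks the entities.

# Rank entities in a small graph by harmonic centrality.
import networkx as nx

# Hypothetical entity-to-entity edges; in the described pipeline these
# would come from a dependency parse over sentences containing the
# extracted named entities.
entity_edges = [
    ("Alan Turing", "Bletchley Park"),
    ("Alan Turing", "Enigma"),
    ("Bletchley Park", "Enigma"),
    ("Enigma", "World War II"),
]
G = nx.Graph(entity_edges)

# Harmonic centrality of u: sum of 1/d(u, v) over all v != u, where d is
# the shortest-path distance; unreachable pairs contribute 0.
scores = nx.harmonic_centrality(G)

# Entities in descending order of centrality; the top-ranked entities
# would then guide sentence selection for the extractive summary.
for entity, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{entity}: {score:.2f}")

One property that makes harmonic centrality a reasonable choice here is that it stays well defined on disconnected graphs: unreachable entity pairs contribute zero, so entities from unrelated parts of a document can still be compared on a single scale.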
Pages: 5301-5311
Number of pages: 11
Related Papers
50 items in total
  • [1] Evaluating the Summarization Comprehension of Pre-Trained Language Models
    Chernyshev, D. I.
    Dobrov, B. V.
    LOBACHEVSKII JOURNAL OF MATHEMATICS, 2023, 44 (08): 3028-3039
  • [2] Deep Entity Matching with Pre-Trained Language Models
    Li, Yuliang
    Li, Jinfeng
    Suhara, Yoshihiko
    Doan, AnHai
    Tan, Wang-Chiew
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2020, 14 (01): 50-60
  • [3] Low Resource Summarization using Pre-trained Language Models
    Munaf, Mubashir
    Afzal, Hammad
    Mahmood, Khawir
    Iltaf, Naima
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2024, 23 (10)
  • [4] Modeling Content Importance for Summarization with Pre-trained Language Models
    Xiao, Liqiang
    Wang, Lu
    He, Hao
    Jin, Yaohui
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020: 3606-3611
  • [5] Probing the Robustness of Pre-trained Language Models for Entity Matching
    Rastaghi, Mehdi Akbarian
    Kamalloo, Ehsan
    Rafiei, Davood
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022, 2022: 3786-3790
  • [6] ENTSUM: A Data Set for Entity-Centric Summarization
    Maddela, Mounica
    Kulkarni, Mayank
    Preotiuc-Pietro, Daniel
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1 (LONG PAPERS), 2022: 3355-3366
  • [7] Pre-trained language models with domain knowledge for biomedical extractive summarization
    Xie, Q.
    Bishop, J. A.
    Tiwari, P.
    Ananiadou, S.
    KNOWLEDGE-BASED SYSTEMS, 2022, 252
  • [8] Entity Linking of Sound Recordings and Compositions with Pre-trained Language Models
    Katakis, Nikiforos
    Vikatos, Pantelis
    PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS AND TECHNOLOGIES (WEBIST), 2021: 474-481