Somun: entity-centric summarization incorporating pre-trained language models

Cited by: 3
Authors
Inan, Emrah [1 ]
Affiliation
[1] Univ Manchester, Sch Comp Sci, Natl Ctr Text Min, Manchester, Lancs, England
Source
NEURAL COMPUTING & APPLICATIONS | 2021, Vol. 33, No. 10
Keywords
Automatic text summarization; Language models; Harmonic centrality; FEATURE-EXTRACTION; CENTRALITY;
DOI
10.1007/s00521-020-05319-2
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text summarization addresses the problem of capturing the essential information in a large volume of text. Existing methods depend either on end-to-end models or on hand-crafted preprocessing steps. In this study, we propose an entity-centric summarization method that extracts named entities and builds a small graph over them with a dependency parser. To extract entities, we employ well-known pre-trained language models. After generating the graph, we perform summarization by ranking entities with the harmonic centrality algorithm. Experiments show that we outperform state-of-the-art unsupervised baselines, improving ROUGE-1 scores by more than 10% and ROUGE-2 scores by more than 50%. Moreover, we achieve results comparable to recent end-to-end models.
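The pipeline described in the abstract (entity extraction with a pre-trained model, graph construction over the entities, harmonic-centrality ranking, sentence selection) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: spaCy's "en_core_web_sm" model and networkx are stand-ins chosen for the example, and the sentence-level co-occurrence edges simplify the dependency-parser-based graph the paper describes.

# Minimal sketch of an entity-centric extractive summarizer.
# Assumptions: spaCy's small English model stands in for the paper's
# pre-trained language models, and entities co-occurring in a sentence
# are linked directly instead of via dependency paths.
import spacy
import networkx as nx

nlp = spacy.load("en_core_web_sm")  # hypothetical stand-in for PLM-based NER

def summarize(text: str, num_sentences: int = 3) -> str:
    doc = nlp(text)
    graph = nx.Graph()

    # One node per named-entity surface form.
    for ent in doc.ents:
        graph.add_node(ent.text)

    # Link entities that appear in the same sentence (simplified edge rule).
    for sent in doc.sents:
        ents = [e.text for e in sent.ents]
        for i, a in enumerate(ents):
            for b in ents[i + 1:]:
                graph.add_edge(a, b)

    # Rank entities by harmonic centrality, as in the abstract.
    scores = nx.harmonic_centrality(graph)

    # Score each sentence by the centrality mass of its entities,
    # then emit the top sentences in original document order.
    ranked = sorted(
        doc.sents,
        key=lambda s: sum(scores.get(e.text, 0.0) for e in s.ents),
        reverse=True,
    )
    chosen = {s.start for s in ranked[:num_sentences]}
    return " ".join(s.text for s in doc.sents if s.start in chosen)

Swapping the co-occurrence edges for edges derived from dependency paths, and the spaCy NER for a transformer-based tagger, would bring the sketch closer to the method the abstract describes.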
Pages: 5301-5311
Number of pages: 11
Related Papers
50 items in total
  • [21] ENTSUMV2: Data, Models and Evaluation for More Abstractive Entity-Centric Summarization
    Mehra, Dhruv
    Xie, Lingjue
    Hofmann-Coyle, Ella
    Kulkarni, Mayank
    Preotiuc-Pietro, Daniel
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 5538 - 5547
  • [22] Training Compact Models for Low Resource Entity Tagging using Pre-trained Language Models
    Izsak, Peter
    Guskin, Shira
    Wasserblat, Moshe
    FIFTH WORKSHOP ON ENERGY EFFICIENT MACHINE LEARNING AND COGNITIVE COMPUTING - NEURIPS EDITION (EMC2-NIPS 2019), 2019, : 44 - 47
  • [23] Space-Efficient Representation of Entity-centric Query Language Models
    Van Gysel, Christophe
    Hannemann, Mirko
    Pusateri, Ernest
    Oualil, Youssef
    Oparin, Ilya
    INTERSPEECH 2022, 2022, : 679 - 683
  • [24] Entity-centric Summarization: Generating Text Summaries for Graph Snippets
    Chhabra, Shruti
    Bedathur, Srikanta
    WWW'14 COMPANION: PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON WORLD WIDE WEB, 2014, : 33 - 37
  • [25] Data-Centric Explainable Debiasing for Improving Fairness in Pre-trained Language Models
    Li, Yingji
    Du, Mengnan
    Song, Rui
    Wang, Xin
    Wang, Ying
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 3773 - 3786
  • [26] SETEM: Self-ensemble training with Pre-trained Language Models for Entity Matching
    Ding, Huahua
    Dai, Chaofan
    Wu, Yahui
    Ma, Wubin
    Zhou, Haohao
    KNOWLEDGE-BASED SYSTEMS, 2024, 293
  • [27] Annotating Columns with Pre-trained Language Models
    Suhara, Yoshihiko
    Li, Jinfeng
    Li, Yuliang
    Zhang, Dan
    Demiralp, Cagatay
    Chen, Chen
    Tan, Wang-Chiew
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (SIGMOD '22), 2022, : 1493 - 1503
  • [28] LaoPLM: Pre-trained Language Models for Lao
    Lin, Nankai
    Fu, Yingwen
    Yang, Ziyu
    Chen, Chuwei
    Jiang, Shengyi
    LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 6506 - 6512
  • [29] PhoBERT: Pre-trained language models for Vietnamese
    Dat Quoc Nguyen
    Anh Tuan Nguyen
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 1037 - 1042
  • [30] Deciphering Stereotypes in Pre-Trained Language Models
    Ma, Weicheng
    Scheible, Henry
    Wang, Brian
    Veeramachaneni, Goutham
    Chowdhary, Pratim
    Sung, Alan
    Koulogeorge, Andrew
    Wang, Lili
    Yang, Diyi
    Vosoughi, Soroush
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 11328 - 11345