Extractive multi-document text summarization based on graph independent sets

Cited by: 38
Authors
Uckan, Taner [1 ]
Karci, Ali [2 ]
Affiliations
[1] Van Yuzuncu Yil Univ, Comp Programming Dept, TR-65000 Van, Turkey
[2] Inonu Univ, Dept Comp Engn, TR-44000 Malatya, Turkey
Keywords
Graph independent set; Graph-based document summarization; Generic document summarization; Extractive text summarization; Multi document text summarization;
DOI
10.1016/j.eij.2019.12.002
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a novel methodology for extractive, generic summarization of text documents. The Maximum Independent Set, which has not previously been used in any summarization study, is utilized in this study. In addition, a text processing tool, which we named KUSH, is suggested to preserve the semantic cohesion between sentences in the representation stage of introductory texts. Our anticipation was that the sentences corresponding to the nodes in the independent set should be excluded from the summary. Based on this anticipation, the nodes forming the Independent Set on the graphs are identified and removed from the graph. Thus, before the effect of the nodes on the global graph is quantified, a limitation is applied to the documents to be summarized. This limitation prevents repeated word groups from being included in the summary. The performance of the proposed approach on the Document Understanding Conference (DUC-2002 and DUC-2004) datasets was calculated using ROUGE evaluation metrics. The developed model achieved a ROUGE performance value of 0.38072 for 100-word summaries, 0.51954 for 200-word summaries, and 0.59208 for 400-word summaries. The values reported throughout the experimental processes of the study reveal the contribution of this innovative method. (C) 2019 Production and hosting by Elsevier B.V. on behalf of Faculty of Computers and Artificial Intelligence, Cairo University.
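The pipeline outlined in the abstract (build a sentence graph, find an independent set, remove its nodes, then score the remaining nodes) might be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how edges are formed, how the Maximum Independent Set is computed, or how the remaining nodes are scored. Here edges come from a Jaccard word-overlap threshold, the independent set is a greedy maximal (not guaranteed maximum) one, and the remaining sentences are ranked by degree. The KUSH tool is not modeled, and all function names and the `threshold` parameter are hypothetical.

```python
from itertools import combinations


def tokenize(sentence):
    """Lowercase word set with surrounding punctuation stripped (assumed preprocessing)."""
    return {w.strip(".,;:!?\"'()").lower() for w in sentence.split()} - {""}


def build_graph(sentences, threshold=0.2):
    """Connect two sentences when their Jaccard word overlap reaches the threshold.

    The similarity measure and threshold are assumptions for this sketch.
    """
    tokens = [tokenize(s) for s in sentences]
    adj = {i: set() for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        union = tokens[i] | tokens[j]
        if union and len(tokens[i] & tokens[j]) / len(union) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def greedy_independent_set(adj):
    """Greedy maximal independent set: visit low-degree nodes first.

    This is a heuristic stand-in; finding a true Maximum Independent Set is NP-hard.
    """
    chosen, blocked = set(), set()
    for node in sorted(adj, key=lambda n: (len(adj[n]), n)):
        if node not in blocked:
            chosen.add(node)
            blocked.add(node)
            blocked |= adj[node]
    return chosen


def summarize(sentences, max_sentences=2, threshold=0.2):
    """Drop independent-set sentences, then rank the rest by degree in the reduced graph."""
    adj = build_graph(sentences, threshold)
    independent = greedy_independent_set(adj)
    remaining = [n for n in adj if n not in independent]
    score = {n: len(adj[n] - independent) for n in remaining}
    ranked = sorted(remaining, key=lambda n: (-score[n], n))
    picked = sorted(ranked[:max_sentences])  # restore original sentence order
    return [sentences[i] for i in picked]
```

In this sketch a sentence with no neighbors always lands in the independent set and is therefore excluded, which matches the stated anticipation that independent-set sentences should stay out of the summary. A length budget in words (100, 200, or 400, as in the reported experiments) could replace `max_sentences`, but that detail is likewise an assumption here.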
Pages: 145-157
Number of pages: 13
Related papers
50 in total
  • [31] Multi-document summarization based on link analysis and text classification
    Wu, JQ
    Wu, YZ
    Liu, J
    Zhuang, YT
    DIGITAL LIBRARIES: INTERNATIONAL COLLABORATION AND CROSS-FERTILIZATION, PROCEEDINGS, 2004, 3334 : 649 - 649
  • [32] Multi-document Arabic Text Summarization based on Thematic Annotation
    Merniz, Amina
    Chaibi, Anja Habacha
    Ben Ghezala, Henda
    PROCEEDINGS OF THE 16TH INTERNATIONAL CONFERENCE ON SOFTWARE TECHNOLOGIES (ICSOFT), 2021, : 639 - 644
  • [33] Unsupervised Graph-Based Tibetan Multi-Document Summarization
    Yan, Xiaodong
    Wang, Yiqin
    Wei Song
    Zhao, Xiaobing
    Run, A.
    Yang Yanxing
    CMC-COMPUTERS MATERIALS & CONTINUA, 2022, 73 (01): : 1769 - 1781
  • [34] SRRank: Leveraging Semantic Roles for Extractive Multi-Document Summarization
    Yan, Su
    Wan, Xiaojun
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (12) : 2048 - 2058
  • [35] A Preliminary Exploration of Extractive Multi-Document Summarization in Hyperbolic Space
    Song, Mingyang
    Feng, Yi
    Jing, Liping
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022, 2022, : 4505 - 4509
  • [36] The impact of term-weighting schemes and similarity measures on extractive multi-document text summarization
    Sanchez-Gomez, Jesus M.
    Vega-Rodriguez, Miguel A.
    Perez, Carlos J.
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 169
  • [37] A document-sensitive graph model for multi-document summarization
    Furu Wei
    Wenjie Li
    Qin Lu
    Yanxiang He
    Knowledge and Information Systems, 2010, 22 : 245 - 259
  • [38] Multi-document Extractive Summarization Using Window-based Sentence Representation
    Zhang, Yong
    Er, Meng Joo
    Zhao, Rui
    2015 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI), 2015, : 404 - 410
  • [39] MSCSO: Extractive Multi-document Summarization Based on a New Criterion of Sentences Overlapping
    Khaleghi, Zeynab
    Fakhredanesh, Mohammad
    Hourali, Maryam
    IRANIAN JOURNAL OF SCIENCE AND TECHNOLOGY-TRANSACTIONS OF ELECTRICAL ENGINEERING, 2021, 45 (01) : 195 - 205
  • [40] A hybrid model for sentence ordering in extractive multi-document summarization
    Liu, Dexi
    Zhang, Zengchang
    He, Yanxiang
    Ji, Donghong
    INFORMATION RETRIEVAL TECHNOLOGY, PROCEEDINGS, 2006, 4182 : 588 - 592