Extractive multi-document text summarization based on graph independent sets

Cited by: 38
Authors
Uckan, Taner [1]
Karci, Ali [2]
Affiliations
[1] Van Yuzuncu Yil Univ, Comp Programming Dept, TR-65000 Van, Turkey
[2] Inonu Univ, Dept Comp Engn, TR-44000 Malatya, Turkey
Keywords
Graph independent set; Graph-based document summarization; Generic document summarization; Extractive text summarization; Multi document text summarization
DOI
10.1016/j.eij.2019.12.002
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We propose a novel methodology for extractive, generic summarization of text documents. It uses the Maximum Independent Set, which has not previously been used in any summarization study. In addition, we propose a text processing tool, named KUSH, to preserve the semantic cohesion between sentences in the representation stage of introductory texts. We hypothesized that the sentences corresponding to the nodes in the independent set should be excluded from the summary. Accordingly, the nodes forming the independent set of the graph are identified and removed from it. Thus, before the effect of each node on the global graph is quantified, a restriction is applied to the documents to be summarized. This restriction prevents repeated word groups from being included in the summary. The performance of the proposed approach on the Document Understanding Conference (DUC-2002 and DUC-2004) datasets was measured using ROUGE evaluation metrics. The developed model achieved ROUGE scores of 0.38072 for 100-word summaries, 0.51954 for 200-word summaries, and 0.59208 for 400-word summaries. The values reported throughout the experiments demonstrate the contribution of this method. © 2019 Production and hosting by Elsevier B.V. on behalf of Faculty of Computers and Artificial Intelligence, Cairo University.
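The pipeline outlined in the abstract (build a sentence graph, find an independent set, drop its nodes, then score the remaining nodes) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the Jaccard word-overlap edges, the `threshold` value, the greedy minimum-degree heuristic (which yields a maximal, not necessarily maximum, independent set), and the degree-based ranking are all assumptions made for illustration, and the KUSH preprocessing step is not modeled.

```python
import re
from itertools import combinations


def tokenize(sentence):
    """Lowercased set of alphanumeric word tokens (illustrative preprocessing)."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def build_graph(sentences, threshold=0.2):
    """Adjacency sets: two sentences are linked when their Jaccard
    word overlap reaches `threshold` (an assumed similarity measure)."""
    tokens = [tokenize(s) for s in sentences]
    adj = {i: set() for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        union = tokens[i] | tokens[j]
        if union and len(tokens[i] & tokens[j]) / len(union) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def greedy_independent_set(adj):
    """Greedy maximal independent set: repeatedly pick the node of lowest
    remaining degree, then discard it and its neighbours. Finding a true
    maximum independent set is NP-hard, so this is only a heuristic."""
    remaining = {v: set(n) for v, n in adj.items()}  # work on a copy
    chosen = set()
    while remaining:
        v = min(remaining, key=lambda u: (len(remaining[u]), u))
        chosen.add(v)
        removed = remaining[v] | {v}
        for u in removed:
            remaining.pop(u, None)
        for u in remaining:
            remaining[u] -= removed
    return chosen


def summarize(sentences, max_sentences=2, threshold=0.2):
    """Drop the independent-set sentences, then rank the rest by their
    degree in the reduced graph (a stand-in for the paper's node scoring)."""
    adj = build_graph(sentences, threshold)
    independent = greedy_independent_set(adj)
    kept = [v for v in adj if v not in independent]
    score = {v: len(adj[v] - independent) for v in kept}
    top = sorted(kept, key=lambda v: (-score[v], v))[:max_sentences]
    return [sentences[v] for v in sorted(top)]  # restore document order
```

Calling `summarize(list_of_sentences, max_sentences=3)` returns at most three sentences in their original order, none of which belong to the greedy independent set. A closer reproduction would swap in the similarity measure and node-scoring scheme described in the paper itself.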
Pages: 145-157
Number of pages: 13