ATSSC: Development of an approach based on soft computing for text summarization

Cited by: 22
Authors
Tayal, Madhuri A. [1 ]
Raghuwanshi, Mukesh M. [2 ]
Malik, Latesh G. [1 ]
Affiliations
[1] GH Raisoni Coll Engn, Dept Comp Sci & Engn, Nagpur, Maharashtra, India
[2] Yeshwantrao Chavhan Coll Engn, Dept Comp Sci & Engn, Nagpur, Maharashtra, India
Source
Keywords
Text document; Summarization; Semantic representation; Clustering; Reference resolution; Evaluation;
DOI
10.1016/j.csl.2016.07.002
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Natural Language Processing (NLP) is a field of computer science and linguistics concerned with the interaction between computers and human languages. It processes data through lexical analysis, syntax analysis, semantic analysis, discourse processing and pragmatic analysis. Intelligent text summarization is one of the most challenging tasks in NLP, and its output can be used in applications such as storytelling and question answering. This paper presents an automatic summarizer for text documents that uses a soft computing approach consisting of SVO (Subject, Verb, Object) rules and tag-based training. The approach processes data through a POS tagger, an NLP parser, ambiguity removal, semantic representation, sentence reduction and sentence combination. First, it defines the theme (title) of the document. Next, it preprocesses the document to perform pronominal reference resolution and text clustering. It then identifies and removes ambiguity from the language using the parser, and calculates a score for each sentence using the title of the document, a Semantic Sentence Similarity utility and the n-gram co-occurrence relations of the words in that sentence. Finally, sentences are combined using the SVO rules after tag-based training on simple and complex sentences. The summarizer was tested on the standard DUC 2007 dataset as well as on a corpus of one hundred text documents from different domains created by us. It achieved F-scores of 0.13523 (ROUGE-2) and 0.112561 (ROUGE-SU4) on the DUC 2007 Update Task documents, and 0.4036 (ROUGE-2) and 0.3129 (ROUGE-SU4) on our corpus. Subjective evaluation of sample system-generated summaries was carried out by five language experts and twenty randomly chosen individuals. (C) 2016 Elsevier Ltd. All rights reserved.
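The ROUGE-2 F-scores reported in the abstract measure bigram overlap between a system summary and a reference summary. The following is a minimal sketch of that idea for a single reference, not the evaluation code used in the paper. The function names `bigrams` and `rouge_2` are illustrative, and the sketch assumes plain whitespace tokenization with lowercasing. The official ROUGE toolkit also offers stemming, stopword handling and multiple references, so this sketch would not reproduce the figures above.

```python
from collections import Counter


def bigrams(text):
    """Return a multiset of lowercase word bigrams (whitespace tokenization, an assumption)."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge_2(candidate, reference):
    """Return (precision, recall, F-score) of bigram overlap for one candidate/reference pair."""
    cand, ref = bigrams(candidate), bigrams(reference)
    # Counter intersection keeps the minimum count per bigram (clipped matches).
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    # Balanced F-score; defined as 0 when there is no overlap at all.
    f_score = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f_score


if __name__ == "__main__":
    system_summary = "the cat sat on the mat"
    reference_summary = "the cat lay on the mat"
    print(rouge_2(system_summary, reference_summary))
```

In this toy pair, three of the five bigrams on each side match, so precision, recall and F-score all come out at about 0.6.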
Pages: 214-235
Page count: 22