ATSSC: Development of an approach based on soft computing for text summarization

Cited by: 22
Authors
Tayal, Madhuri A. [1 ]
Raghuwanshi, Mukesh M. [2 ]
Malik, Latesh G. [1 ]
Affiliations
[1] GH Raisoni Coll Engn, Dept Comp Sci & Engn, Nagpur, Maharashtra, India
[2] Yeshwantrao Chavhan Coll Engn, Dept Comp Sci & Engn, Nagpur, Maharashtra, India
Source
Keywords
Text document; Summarization; Semantic representation; Clustering; Reference resolution; Evaluation;
DOI
10.1016/j.csl.2016.07.002
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Natural Language Processing (NLP) is a field of computer science and linguistics concerned with the interaction between computers and human languages. It processes data through lexical analysis, syntax analysis, semantic analysis, discourse processing and pragmatic analysis. Intelligent text summarization is one of the most challenging tasks in natural language processing, and it can be further used for applications such as storytelling and question answering. This paper presents an automatic summarizer for text documents that uses a soft computing approach consisting of SVO (Subject, Verb, Object) rules and tag-based training. The approach processes data through a POS tagger, an NLP parser, ambiguity removal, semantic representation, sentence reduction and sentence combination. First, the theme (title) of the document is defined. Next, the text document is preprocessed to perform pronominal reference resolution and text clustering. After these preprocessing operations, ambiguity in the language is identified and removed using the parser. Then a score is calculated for each sentence using the title of the document, a Semantic Sentence Similarity utility and the n-gram co-occurrence relations of the words in that sentence. Finally, sentences are combined using the SVO rules after tag-based training on simple and complex sentences. The summarizer was tested on the standard DUC 2007 dataset as well as on a corpus of one hundred text documents from different domains created by us. The evaluation produced F-scores of 0.13523 (ROUGE-2) and 0.112561 (ROUGE-SU4) for the DUC 2007 Update Task documents and 0.4036 (ROUGE-2) and 0.3129 (ROUGE-SU4) for our corpus. Subjective evaluation of sample system-generated summaries was carried out by five language experts and twenty randomly chosen individuals. (C) 2016 Elsevier Ltd. All rights reserved.
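The abstract reports ROUGE-2 F-scores but does not spell out how such a score is computed. As a rough illustration only, the sketch below computes a ROUGE-2 F-score from clipped bigram overlap between one candidate summary and one reference summary. It is not the official ROUGE toolkit and not the authors' evaluation setup: the function names `bigrams` and `rouge_2_f`, the whitespace tokenization, and the single-reference setting are all assumptions made for this example.

```python
from collections import Counter


def bigrams(text):
    """Lowercase, split on whitespace, and count adjacent word pairs."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge_2_f(candidate, reference):
    """Illustrative ROUGE-2 F-score of a candidate against a single reference."""
    cand, ref = bigrams(candidate), bigrams(reference)
    # Counter intersection keeps the minimum count per bigram (clipped matches).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())  # matched share of candidate bigrams
    recall = overlap / sum(ref.values())      # matched share of reference bigrams
    return 2 * precision * recall / (precision + recall)
```

Published DUC scores are normally produced with the official toolkit, which also applies options such as stemming, stop-word handling and multiple references, so values from this sketch would not be directly comparable to the figures in the abstract.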
Pages: 214 - 235
Page count: 22
Related papers
50 in total
  • [1] Automatic Text Summarization: Soft Computing Based Approaches
    Azhari, Muhammad
    Kumar, Yogan Jaya
    Goh, Ong Sing
    Ngo, Hea Choon
    ADVANCED SCIENCE LETTERS, 2018, 24 (02) : 1206 - 1209
  • [2] A soft computing approach to big data summarization
    Smits, Gregory
    Pivert, Olivier
    Yager, Ronald R.
    Nerzic, Pierre
    FUZZY SETS AND SYSTEMS, 2018, 348 : 4 - 20
  • [3] Topic Modeling Based Text Summarization Approach
    Yu, Shusi
    Wang, Wei
    2015 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING APPLICATIONS (CSEA 2015), 2015, : 203 - 207
  • [4] A two layered case based reasoning approach to text summarization, based on summarization pattern
    Reyhani, N
    Badie, K
    Kharrat, M
    2003 IEEE SYSTEMS & INFORMATION ENGINEERING DESIGN SYMPOSIUM, 2003, : 47 - 50
  • [5] Indian Legal Text Summarization: A Text Normalization-based Approach
    Ghosh, Satyajit
    Dutta, Mousumi
    Das, Tanaya
    2022 IEEE 19TH INDIA COUNCIL INTERNATIONAL CONFERENCE, INDICON, 2022,
  • [6] An approach to sentence-selection-based text summarization
    Chen, F
    Han, KS
    Chen, GL
    2002 IEEE REGION 10 CONFERENCE ON COMPUTERS, COMMUNICATIONS, CONTROL AND POWER ENGINEERING, VOLS I-III, PROCEEDINGS, 2002, : 489 - 493
  • [7] ANALYSING FUZZY BASED APPROACH FOR EXTRACTIVE TEXT SUMMARIZATION
    Sharaff, Aakanksha
    Khaire, Amit Siddharth
    Sharma, Dimple
    PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICCS), 2019, : 906 - 910
  • [8] A Trigraph based Centrality Approach towards Text Summarization
    Raj, Devika
    Geetha, M.
    PROCEEDINGS OF THE 2018 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATION AND SIGNAL PROCESSING (ICCSP), 2018, : 796 - 801
  • [9] An approach to Abstractive Text Summarization
    Huong Thanh Le
    Tien Manh Le
    2013 INTERNATIONAL CONFERENCE OF SOFT COMPUTING AND PATTERN RECOGNITION (SOCPAR), 2013, : 371 - 376
  • [10] A Modification to Graph Based Approach for Extraction Based Automatic Text Summarization
    Sehgal, Sunchit
    Kumar, Badal
    Maheshwar
    Rampal, Lakshay
    Chaliya, Ankit
    PROGRESS IN ADVANCED COMPUTING AND INTELLIGENT ENGINEERING, VOL 2, 2018, 564 : 373 - 378