Automatic Text Summarization Using Latent Semantic Analysis

Cited by: 20
Authors
Mashechkin, I. V. [1]
Petrovskiy, M. I. [1]
Popov, D. S. [1]
Tsarev, D. V. [1]
Affiliations
[1] Moscow MV Lomonosov State Univ, Dept Computat Math & Cybernet, Moscow 119991, Russia
Keywords
Singular Value Decomposition; Latent Semantic Analysis; Original Text; Nonnegative Matrix Factorization; Model Summary;
DOI
10.1134/S0361768811060041
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
This paper considers state-of-the-art methods of automatic text summarization that build summaries in the form of generic extracts. The original text is represented as a numerical matrix whose columns correspond to the sentences of the text; each sentence is represented as a vector in the term space. Latent semantic analysis is then applied to this matrix to construct a representation of the sentences in the topic space, whose dimensionality is much lower than that of the initial term space. The most important sentences are selected on the basis of their representation in the topic space, and the number of selected sentences is determined by the required summary length. The paper also presents a new generic text summarization method that uses nonnegative matrix factorization to estimate sentence relevance. The proposed relevance estimate is based on normalizing the topic space and then weighting each topic using the sentence representations in the topic space. The proposed method shows better summarization quality and performance than state-of-the-art methods on the DUC 2001 and DUC 2002 standard data sets.
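The pipeline outlined in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the normalization or topic-weighting formulas, so the unit-length topic normalization, the weight of a topic as its share of the total activation, and all function names (`term_sentence_matrix`, `nmf`, `summarize`) are hypothetical choices made here. The factorization uses the standard multiplicative-update rules for nonnegative matrix factorization, written in pure Python to stay self-contained.

```python
import random
import re


def term_sentence_matrix(sentences):
    """Rows are terms, columns are sentences; entries are raw term counts."""
    tokens = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    vocab = sorted({t for sent in tokens for t in sent})
    return [[sent.count(term) for sent in tokens] for term in vocab]


def transpose(x):
    return [list(col) for col in zip(*x)]


def matmul(x, y):
    cols = list(zip(*y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in x]


def nmf(a, k, iters=200, seed=0, eps=1e-9):
    """Approximate a (terms x sentences) by w @ h with nonnegative factors."""
    rng = random.Random(seed)
    m, n = len(a), len(a[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        # Multiplicative update for h: h *= (w^T a) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, a), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(k)]
        # Multiplicative update for w: w *= (a h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(a, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(m)]
    return w, h


def summarize(sentences, k=2, length=2):
    """Return the `length` highest-scoring sentences in original order."""
    a = term_sentence_matrix(sentences)
    w, h = nmf(a, k)
    # ASSUMPTION: normalize each topic (column of w) to unit length and
    # rescale the matching row of h so that the product w @ h is unchanged.
    for t in range(k):
        norm = sum(row[t] ** 2 for row in w) ** 0.5 or 1.0
        for row in w:
            row[t] /= norm
        h[t] = [v * norm for v in h[t]]
    # ASSUMPTION: a topic's weight is its share of the total activation in h.
    total = sum(sum(row) for row in h) or 1.0
    topic_weight = [sum(row) / total for row in h]
    scores = [sum(topic_weight[t] * h[t][j] for t in range(k))
              for j in range(len(sentences))]
    ranked = sorted(range(len(sentences)), key=lambda j: -scores[j])
    chosen = sorted(ranked[:length])
    return [sentences[j] for j in chosen], scores
```

Calling `summarize(sentences, k=2, length=2)` on a list of sentence strings returns the selected sentences together with the per-sentence scores. The `length` parameter plays the role of the required summary length, and `k` is the dimensionality of the topic space, which should be much smaller than the vocabulary size.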
Pages: 299-305
Page count: 7
Related papers
50 in total
  • [1] Automatic text summarization using latent semantic analysis
    I. V. Mashechkin
    M. I. Petrovskiy
    D. S. Popov
    D. V. Tsarev
    Programming and Computer Software, 2011, 37 : 299 - 305
  • [2] Automatic Text Summarization of Konkani Texts Using Latent Semantic Analysis
    D'Silva, Jovi
    Sharma, Uzzal
    More, Chaitali
    INTERNATIONAL CONFERENCE ON INNOVATIVE COMPUTING AND COMMUNICATIONS, ICICC 2022, VOL 1, 2023, 473 : 425 - 437
  • [3] Text summarization using Latent Semantic Analysis
    Ozsoy, Makbule Gulcin
    Alpaslan, Ferda Nur
    Cicekli, Ilyas
    JOURNAL OF INFORMATION SCIENCE, 2011, 37 (04) : 405 - 417
  • [4] KANNADA TEXT SUMMARIZATION USING LATENT SEMANTIC ANALYSIS
    Geetha, J. K.
    Deepamala, N.
    2015 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2015, : 1508 - 1512
  • [5] Automatic text summarization based on latent semantic indexing
    Ai, Dongmei
    Zheng, Yuchao
    Zhang, Dezheng
    ARTIFICIAL LIFE AND ROBOTICS, 2010, 15 (01) : 25 - 29
  • [6] Text summarization using a trainable summarizer and latent semantic analysis
    Yeh, JY
    Ke, HR
    Yang, WP
    Meng, IH
    INFORMATION PROCESSING & MANAGEMENT, 2005, 41 (01) : 75 - 95
  • [7] Chinese text summarization using a trainable summarizer and latent semantic analysis
    Yeh, JY
    Ke, HR
    Yang, WP
    DIGITAL LIBRARIES: PEOPLE, KNOWLEDGE, AND TECHNOLOGY, PROCEEDINGS, 2002, 2555 : 76 - 87
  • [8] Latent Topic-semantic Indexing based Automatic Text Summarization
    Yu, Jiangsheng
    Chen, Xue-wen
    2016 15TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2016), 2016, : 120 - 126
  • [9] A Hybrid Approach of Text Summarization Using Latent Semantic Analysis and Deep Learning
    Shah, Chintan
    Jivani, Anjali
    2018 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2018, : 2039 - 2044
  • [10] NLP Based Latent Semantic Analysis for Legal Text Summarization
    Merchant, Kaiz
    Pande, Yash
    2018 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2018, : 1803 - 1807