Automatic Text Summarization Using Latent Semantic Analysis

Cited by: 20
Authors
Mashechkin, I. V. [1]
Petrovskiy, M. I. [1]
Popov, D. S. [1]
Tsarev, D. V. [1]
Affiliation
[1] Moscow MV Lomonosov State Univ, Dept Computat Math & Cybernet, Moscow 119991, Russia
Keywords
Singular Value Decomposition; Latent Semantic Analysis; Original Text; Nonnegative Matrix Factorization; Model Summary;
DOI
10.1134/S0361768811060041
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
This paper considers state-of-the-art methods of automatic text summarization that build summaries in the form of generic extracts. The original text is represented as a numerical matrix whose columns correspond to the sentences of the text, so that each sentence is a vector in the term space. Latent semantic analysis is then applied to this matrix to construct a representation of the sentences in the topic space, whose dimensionality is much lower than that of the initial term space. The most important sentences are chosen on the basis of their representation in the topic space, and the number of sentences chosen is determined by the required summary length. The paper also presents a new generic text summarization method that uses nonnegative matrix factorization to estimate sentence relevance. The proposed relevance estimate is based on normalizing the topic space and then weighting each topic using the sentence representations in the topic space. The proposed method shows better summarization quality and performance than state-of-the-art methods on the standard DUC 2001 and DUC 2002 data sets.
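The generic pipeline described in the abstract (term-by-sentence matrix, projection into a low-dimensional topic space, sentence selection) can be illustrated with a minimal sketch. This is not the paper's NMF-based method: it swaps in a plain truncated SVD and scores each sentence by the length of its singular-value-weighted vector in the topic space, which is one common LSA scoring choice. The function name `lsa_summarize`, the regex tokenizer, the raw term counts, and the parameters `n_topics` and `n_select` are all assumptions made for illustration.

```python
import re
import numpy as np

def lsa_summarize(sentences, n_topics=2, n_select=2):
    """Illustrative LSA extract (hypothetical helper, not the paper's method)."""
    # Assumption: a simple lowercase alphabetic tokenizer, no stemming or stop words.
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    vocab = sorted({t for toks in tokenized for t in toks})
    index = {t: i for i, t in enumerate(vocab)}

    # Term-by-sentence matrix: rows are terms, columns are sentences.
    A = np.zeros((len(vocab), len(sentences)))
    for j, toks in enumerate(tokenized):
        for t in toks:
            A[index[t], j] += 1.0  # raw term counts (assumed weighting)

    # SVD: A = U * diag(s) * Vt. Rows of Vt give sentence coordinates per topic.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(n_topics, len(s))

    # Sentence representation in the k-dimensional topic space,
    # weighted by the singular values.
    topic_repr = s[:k, None] * Vt[:k, :]

    # Score each sentence by the Euclidean length of its topic-space vector.
    scores = np.sqrt((topic_repr ** 2).sum(axis=0))

    # Keep the n_select highest-scoring sentences, returned in document order.
    top = np.argsort(-scores)[:n_select]
    return sorted(top.tolist()), scores
```

Calling `lsa_summarize(sentences, n_topics=2, n_select=3)` returns the indices of the selected sentences together with all scores; here `n_select` plays the role of the required summary length.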
Pages: 299-305
Number of pages: 7
Related papers
50 in total
  • [21] An Approach to Automatic Text Summarization using WordNet
    Pal, Alok Ranjan
    Saha, Diganta
    SOUVENIR OF THE 2014 IEEE INTERNATIONAL ADVANCE COMPUTING CONFERENCE (IACC), 2014, : 1169 - 1173
  • [22] Automatic Text Summarization
    Fattah, Mohamed Abdel
    Ren, Fuji
    PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 27, 2008, 27 : 192 - +
  • [23] Automatic Text Summarization using Word Embeddings
    Easwar, Arjun
    Uthra, Annie
    PROCEEDINGS OF THE 2021 FIFTH INTERNATIONAL CONFERENCE ON I-SMAC (IOT IN SOCIAL, MOBILE, ANALYTICS AND CLOUD) (I-SMAC 2021), 2021, : 1065 - 1079
  • [24] Latent semantic analysis for text categorization using neural network
    Yu, Bo
    Xu, Zong-ben
    Li, Cheng-hua
    KNOWLEDGE-BASED SYSTEMS, 2008, 21 (08) : 900 - 904
  • [25] Latent semantic analysis for text segmentation
    Choi, FYY
    Wiemer-Hastings, P
    Moore, J
    PROCEEDINGS OF THE 2001 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, 2001, : 109 - 117
  • [26] Semantic Graph Based Automatic Text Summarization for Hindi Documents Using Particle Swarm Optimization
    Dalal, Vipul
    Malik, Latesh
    INFORMATION AND COMMUNICATION TECHNOLOGY FOR INTELLIGENT SYSTEMS (ICTIS 2017) - VOL 2, 2018, 84 : 284 - 289
  • [27] Extractive Automatic Text Summarization Based on Lexical-Semantic Keywords
    Hernandez-Castaneda, Angel
    Arnulfo Garcia-Hernandez, Rene
    Ledeneva, Yulia
    Eduardo Millan-Hernandez, Christian
    IEEE ACCESS, 2020, 8 : 49896 - 49907
  • [28] Improved spoken document summarization using Probabilistic Latent Semantic Analysis (PLSA)
    Kong, Sheng-Yi
    Lee, Lin-shan
    2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 941 - 944
  • [29] Latent Semantic Analysis: An Approach to Understand Semantic of Text
    Kherwa, Pooja
    Bansal, Poonam
    2017 INTERNATIONAL CONFERENCE ON CURRENT TRENDS IN COMPUTER, ELECTRICAL, ELECTRONICS AND COMMUNICATION (CTCEEC), 2017, : 870 - 874
  • [30] Text summarization evaluation using semantic probability distributions
    Le, Anh
    Wu, Fred
    Vu, Lan
    Le, Thanh
    2023 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE, CSCI 2023, 2023, : 207 - 212