Automatic Text Summarization Using Latent Semantic Analysis

Cited by: 20
Authors
Mashechkin, I. V. [1]
Petrovskiy, M. I. [1]
Popov, D. S. [1]
Tsarev, D. V. [1]
Affiliations
[1] Moscow MV Lomonosov State Univ, Dept Computat Math & Cybernet, Moscow 119991, Russia
Keywords
Singular Value Decomposition; Latent Semantic Analysis; Original Text; Nonnegative Matrix Factorization; Model Summary
DOI
10.1134/S0361768811060041
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
This paper reviews state-of-the-art methods of automatic text summarization that build summaries in the form of generic extracts. The original text is represented as a numerical matrix whose columns correspond to sentences, so that each sentence is a vector in the term space. Latent semantic analysis is then applied to this matrix to construct a representation of the sentences in a topic space whose dimensionality is much lower than that of the original term space. The most important sentences are chosen on the basis of their representation in the topic space, and the number of sentences selected is determined by the required summary length. The paper also presents a new generic text summarization method that uses nonnegative matrix factorization to estimate sentence relevance. The proposed relevance estimate is based on normalizing the topic space and then weighting each topic using the sentence representations in the topic space. The proposed method shows better summarization quality and performance than state-of-the-art methods on the standard DUC 2001 and DUC 2002 data sets.
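The pipeline outlined in the abstract (term-by-sentence matrix, projection into a low-dimensional topic space, sentence selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenizer, the raw-count weighting, the function names `term_sentence_matrix` and `lsa_summary`, and the scoring rule (length of each sentence vector in the truncated SVD topic space, scaled by the singular values) are all assumptions chosen for illustration. The NMF-based relevance estimate proposed in the paper is not reproduced here.

```python
import re
import numpy as np


def term_sentence_matrix(sentences):
    """Build a term-by-sentence count matrix (rows: terms, columns: sentences)."""
    # Hypothetical tokenizer: lowercase alphabetic words only.
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    vocab = sorted({t for toks in tokenized for t in toks})
    index = {t: i for i, t in enumerate(vocab)}
    A = np.zeros((len(vocab), len(sentences)))
    for j, toks in enumerate(tokenized):
        for t in toks:
            A[index[t], j] += 1.0
    return A, vocab


def lsa_summary(sentences, n_topics=2, n_sentences=2):
    """Pick the highest-scoring sentences in a truncated SVD topic space.

    Assumed scoring rule (one common choice, not necessarily the paper's):
    the Euclidean length of each sentence's topic-space vector, with each
    topic coordinate scaled by its singular value.
    """
    A, _ = term_sentence_matrix(sentences)
    # A = U @ diag(s) @ Vt; column j of Vt describes sentence j by topic.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(n_topics, len(s))
    topic_repr = s[:k, None] * Vt[:k, :]  # shape: (k, number of sentences)
    scores = np.linalg.norm(topic_repr, axis=0)
    top = np.argsort(-scores)[:n_sentences]
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(top)], scores


if __name__ == "__main__":
    docs = [
        "Latent semantic analysis maps sentences into a topic space.",
        "The topic space has far fewer dimensions than the term space.",
        "The weather was pleasant on the day of the conference.",
        "Sentences are ranked by their representation in the topic space.",
    ]
    summary, scores = lsa_summary(docs, n_topics=2, n_sentences=2)
    for sentence in summary:
        print(sentence)
```

The `n_sentences` argument plays the role of the required summary length, and `n_topics` sets the dimensionality of the topic space. Because no stop-word removal or term weighting is applied, frequent function words influence the scores in this sketch.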
Pages: 299-305
Number of pages: 7