Automatic Text Summarization Using Latent Semantic Analysis

Cited by: 20
Authors
Mashechkin, I. V. [1]
Petrovskiy, M. I. [1]
Popov, D. S. [1]
Tsarev, D. V. [1]
Affiliations
[1] Moscow MV Lomonosov State Univ, Dept Computat Math & Cybernet, Moscow 119991, Russia
Keywords
Singular Value Decomposition; Latent Semantic Analysis; Original Text; Nonnegative Matrix Factorization; Model Summary;
DOI
10.1134/S0361768811060041
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
This paper reviews state-of-the-art methods of automatic text summarization that build summaries in the form of generic extracts. The original text is represented as a numerical matrix whose columns correspond to the sentences of the text, so that each sentence is a vector in the term space. Latent semantic analysis is then applied to this matrix to construct a representation of the sentences in the topic space, whose dimensionality is much lower than that of the original term space. The most important sentences are selected on the basis of their representation in the topic space, and the number of selected sentences is determined by the required summary length. The paper also presents a new generic text summarization method that uses nonnegative matrix factorization to estimate sentence relevance. The proposed relevance estimate is based on normalizing the topic space and then weighting each topic using the sentence representations in the topic space. On the DUC 2001 and DUC 2002 standard data sets, the proposed method shows better summarization quality and performance than state-of-the-art methods.
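The LSA pipeline outlined in the abstract (term-by-sentence matrix, reduction to a topic space, sentence selection up to a target length) can be illustrated with a minimal sketch. It is not the authors' NMF-based method, whose details the abstract does not give. It uses a truncated SVD and one commonly used scoring rule, the length of each sentence vector in the singular-value-weighted topic space. The helper names (`split_sentences`, `term_sentence_matrix`, `lsa_sentence_scores`, `summarize`), the raw-count term weighting, and the naive sentence splitter are all assumptions made for illustration.

```python
import re
from collections import Counter

import numpy as np


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def term_sentence_matrix(sentences):
    """Build a matrix with one row per term and one column per sentence.

    Entries are raw term counts; other weighting schemes are possible.
    """
    counts = [Counter(re.findall(r"[a-z]+", s.lower())) for s in sentences]
    vocab = sorted(set().union(*counts))
    matrix = np.array([[c[t] for c in counts] for t in vocab], dtype=float)
    return matrix, vocab


def lsa_sentence_scores(matrix, n_topics):
    """Score sentence j as sqrt(sum_i (sigma_i * v_ij)^2) over the top topics."""
    _, sigma, vt = np.linalg.svd(matrix, full_matrices=False)
    k = min(n_topics, len(sigma))
    weighted = sigma[:k, None] * vt[:k, :]  # topic-space coordinates per sentence
    return np.sqrt((weighted ** 2).sum(axis=0))


def summarize(text, n_sentences=2, n_topics=2):
    """Return the highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    matrix, _ = term_sentence_matrix(sentences)
    scores = lsa_sentence_scores(matrix, n_topics)
    top = sorted(np.argsort(-scores)[:n_sentences])
    return [sentences[i] for i in top]
```

Calling `summarize(text, n_sentences=3, n_topics=2)` returns up to three sentences of `text`. Here `n_sentences` plays the role of the required summary length, and `n_topics` is the dimensionality of the topic space, which is chosen to be much smaller than the vocabulary size.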
Pages: 299-305
Number of pages: 7