A text similarity measurement method based on singular value decomposition and semantic relevance

Cited by: 7
Authors
Li X. [1]
Yao C. [1]
Fan F. [1]
Yu X. [1]
Affiliations
[1] School of Information Science and Engineering, Dalian Polytechnic University, Dalian
Source
Li, Xu (lixu102@aliyun.com) | Korea Information Processing Society | 2017, No. 13
Keywords
Natural language processing; Semantic relevance; Singular value decomposition; Text representation; Text similarity measurement
DOI
10.3745/JIPS.02.0067
Abstract
Traditional text similarity measures based on word-frequency vectors ignore the semantic relationships between words. Together with the high dimensionality and sparsity of document vectors, this has become an obstacle to text similarity calculation. To address these problems, an improved singular value decomposition is used to reduce the dimensionality of the text representation model and to remove its noise. The optimal number of singular values is analyzed, and the semantic relevance between words is calculated in the constructed semantic space. An inverted index construction algorithm and definitions of similarity between vectors are proposed to calculate the similarity between two documents at the semantic level. Experimental results on a benchmark corpus show that the proposed method improves the F-measure. © 2017 KIPS.
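The general idea in the abstract, projecting word-frequency vectors into a low-rank semantic space and comparing them there, can be illustrated with a minimal sketch. It uses plain truncated SVD (ordinary latent semantic analysis) on raw term counts. It does not reproduce the paper's improved SVD, its analysis of the optimal number of singular values, or its inverted index algorithm. The toy corpus, the choice k = 2, and all function names are assumptions made for illustration only.

```python
import numpy as np

def build_term_document_matrix(docs):
    """Raw term-count matrix: one row per term, one column per document."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    return A, index

def cosine(u, v):
    """Cosine similarity, defined as 0.0 if either vector is all zeros."""
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return float(u @ v / (nu * nv))

# Hypothetical toy corpus: two documents about cats, two about markets.
docs = [
    "the cat sat on the mat",
    "a cat lay on a rug",
    "stock prices fell sharply today",
    "market and stock prices dropped",
]

A, index = build_term_document_matrix(docs)

# Keep only the k largest singular values (k = 2 is an assumption here).
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
term_vecs = U[:, :k] * s[:k]    # each row: a term in the k-dim space
doc_vecs = Vt[:k, :].T * s[:k]  # each row: a document in the k-dim space

def doc_similarity(i, j):
    return cosine(doc_vecs[i], doc_vecs[j])

def term_relevance(a, b):
    return cosine(term_vecs[index[a]], term_vecs[index[b]])

print("doc 0 vs doc 1:", round(doc_similarity(0, 1), 3))
print("doc 0 vs doc 2:", round(doc_similarity(0, 2), 3))
print("'mat' vs 'rug':", round(term_relevance("mat", "rug"), 3))
print("'mat' vs 'stock':", round(term_relevance("mat", "stock"), 3))
```

In this toy corpus "mat" and "rug" never occur in the same document, yet they end up close in the reduced space because they share the context words "cat" and "on". That is the kind of word-level relationship a raw word-frequency comparison would miss.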
Pages: 863-875
Page count: 12