Automatic Text Summarization Method Based on Improved TextRank Algorithm and K-Means Clustering

Cited by: 7
Authors
Liu, Wenjun [1, 2]
Sun, Yuyan [2]
Yu, Bao [2]
Wang, Hailan [2]
Peng, Qingcheng [2]
Hou, Mengshu [1, 3]
Guo, Huan [2]
Wang, Hai [2]
Liu, Cheng [1, 4]
Affiliations
[1] Univ Elect Sci & Technol China UESTC, Sch Comp Sci & Engn, Chengdu 611731, Peoples R China
[2] Xihua Univ, Sch Comp & Software Engn, Chengdu 610039, Peoples R China
[3] Chengdu Technol Univ, Sch Big Data & Artificial Intelligence, Chengdu 611730, Peoples R China
[4] 30th Res Inst China Elect Technol Grp Corp, Sci & Technol Commun Secur Lab, Chengdu 610041, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Text Summarization; Sentence Vector; K-means Clustering; Word Embedding; TextRank Algorithm;
DOI
10.1016/j.knosys.2024.111447
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Automatic text summarization compresses a text into a summary while retaining its important information, so that users can grasp the main content by reading the summary alone. In the research literature, extractive summarization is widely used and is one of the main families of summarization methods. However, existing extractive methods still have problems: the initial cluster centers are not chosen carefully, and the resulting summaries contain highly redundant sentences for articles with complex sentences. To address these problems, this paper proposes an automatic text summarization method based on an improved TextRank algorithm and K-Means clustering. The method combines an improved BM25 model with the TextRank algorithm: it computes the BM25 similarity between sentences and obtains a TextRank (TR) score for each sentence. The TR scores are then used to select the initial cluster centers through a similarity difference judgment and a maximum judgment. The final summary is obtained by combining cluster scores and sentence scores. Experimental results on the DUC2004 dataset show that the proposed method achieves better ROUGE-1, ROUGE-2 and ROUGE-L scores than the comparison algorithms Lead-3, TextRank and MBM25EMB. In conclusion, the proposed method improves the accuracy of automatic text summarization and reduces redundancy in the summaries.
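To make the scoring stage described in the abstract more concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It uses the standard Okapi BM25 formula (not the paper's improved BM25 model, whose details are not given here), a plain power-iteration TextRank, and whitespace tokenization. The function names `bm25_matrix`, `textrank` and `top_k_seeds`, the parameters `k1`, `b` and `d`, and the top-k seeding rule are all hypothetical stand-ins; in particular, `top_k_seeds` is a simplification and does not reproduce the paper's similarity difference judgment or maximum judgment.

```python
import math
from collections import Counter


def bm25_matrix(sentences, k1=1.5, b=0.75):
    """Pairwise standard BM25: sim[i][j] scores sentence j against query sentence i.

    Assumption: standard Okapi BM25 with whitespace tokens, not the paper's variant.
    """
    docs = [s.lower().split() for s in sentences]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency of each word, then a non-negative IDF.
    df = Counter(w for d in docs for w in set(d))
    idf = {w: math.log(1 + (n - f + 0.5) / (f + 0.5)) for w, f in df.items()}
    tfs = [Counter(d) for d in docs]
    sim = [[0.0] * n for _ in range(n)]
    for i, query in enumerate(docs):
        for j in range(n):
            if i == j:
                continue  # no self-similarity edge
            norm = k1 * (1 - b + b * len(docs[j]) / avgdl)
            sim[i][j] = sum(
                idf[w] * tfs[j][w] * (k1 + 1) / (tfs[j][w] + norm)
                for w in set(query)
            )
    return sim


def textrank(sim, d=0.85, iters=50):
    """Weighted TextRank scores via power iteration over the similarity graph."""
    n = len(sim)
    out = [sum(row) for row in sim]  # total outgoing weight per sentence
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [
            (1 - d) / n
            + d * sum(sim[j][i] / out[j] * scores[j] for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return scores


def top_k_seeds(scores, k):
    """Hypothetical seeding rule: indices of the k highest-scoring sentences."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
```

A possible use: call `bm25_matrix` on the list of sentences of a document, pass the result to `textrank`, and hand the sentence vectors at the indices returned by `top_k_seeds` to a K-Means implementation as its initial centers. How the paper then combines cluster scores with sentence scores is not specified in the abstract and is left out of this sketch.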
Pages: 15