Research on Cross Language Text Keyword Extraction Based on Information Entropy and TextRank

Cited by: 0
Authors
Zhang, Xiaoyu [1]
Wang, Yongbin [1]
Wu, Lin [1]
Affiliations
[1] Communication University of China, Internet Information Research Institute, Beijing 100024, People's Republic of China
Keywords
component; information entropy; TextRank; keyword extraction; cross-language keyword extraction
DOI
10.1109/itnec.2019.8728993
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
To extract keywords from cross-language documents as accurately as possible, especially for languages whose keyword extraction technology is immature, a text keyword extraction method based on information entropy and TextRank is proposed for extracting keywords from documents translated into Chinese. The method determines the basic importance of each word from its information entropy, and then uses these entropy values to weight the iterative voting of the TextRank algorithm. This addresses the tendency of TextRank to select frequent non-key words as keywords. The experimental results show that the proposed method extracts keywords more accurately than TextRank when processing cross-lingual bilingual translated documents.
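The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that a word's "information entropy" can be approximated by its self-information, -log2 p(w), estimated from a background corpus, and that this weight biases both the teleport term and the neighbour votes of TextRank. The function names, the window size, and the damping factor are all illustrative assumptions, and the input is assumed to be already translated and segmented into tokens.

```python
import math
from collections import Counter, defaultdict


def self_information(background_docs):
    """Assumed word weight: -log2 of the word's relative frequency in a background corpus."""
    counts = Counter(w for doc in background_docs for w in doc)
    total = sum(counts.values())
    return {w: -math.log2(c / total) for w, c in counts.items()}


def cooccurrence_graph(tokens, window=3):
    """Undirected graph linking distinct words that share a sliding window of `window` tokens."""
    graph = defaultdict(set)
    for i, w in enumerate(tokens):
        for v in tokens[i + 1 : i + window]:
            if v != w:
                graph[w].add(v)
                graph[v].add(w)
    return graph


def entropy_textrank(tokens, info, window=3, damping=0.85, iterations=50):
    """Return a dict of word scores from an information-weighted TextRank (a sketch)."""
    graph = cooccurrence_graph(tokens, window)
    # Assumption: words missing from the background corpus get the largest observed weight.
    default = max(info.values())
    # Clamp to a tiny positive value so the normalisations below never divide by zero.
    weight = {w: max(info.get(w, default), 1e-9) for w in graph}
    norm = sum(weight.values())
    base = {w: weight[w] / norm for w in graph}  # entropy-based starting importance
    score = dict(base)
    for _ in range(iterations):
        new_score = {}
        for w in graph:
            votes = 0.0
            for u in graph[w]:
                # Neighbour u splits its score among its neighbours in proportion to their weight.
                out = sum(weight[k] for k in graph[u])
                votes += score[u] * weight[w] / out
            new_score[w] = (1 - damping) * base[w] + damping * votes
        score = new_score
    return score


if __name__ == "__main__":
    background = [["the", "cat", "the", "dog"], ["the", "entropy", "the", "graph"]]
    info = self_information(background)
    scores = entropy_textrank(["entropy", "the", "graph"], info)
    for word in sorted(scores, key=scores.get, reverse=True):
        print(word, round(scores[word], 4))
```

In this toy run the frequent word "the" has a lower weight than the rarer words, so it receives a smaller share of each neighbour's vote. This is the effect the abstract attributes to the method, although the paper's actual weighting scheme may differ.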
Pages: 16-19
Page count: 4