Comparing keyword extraction techniques for WEBSOM text archives

Times cited: 0

Authors:

Azcarraga, AP [1]

Yap, TN [1]

Affiliations:

[1] Natl Univ Singapore, Sch Comp, PRIS Grp, Singapore 117543, Singapore

Source:

ICTAI 2001: 13TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, PROCEEDINGS | 2001

Keywords:

DOI:

10.1109/ICTAI.2001.974464

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The WEBSOM methodology for building very large text archives has a very slow method for extracting meaningful unit labels. This is because the method computes the relative frequencies of all the words of all the documents associated with each unit and then compares these to the relative frequencies of all the words of all the other units of the map. Since maps may have more than 100,000 units and the archive may contain up to 7 million documents, the existing WEBSOM method is not practical. A fast alternative method is based on the distribution of weights in the weight vectors of the trained map, plus a simple manipulation of the random projection matrix used for input data compression. Comparisons made using a WEBSOM archive of the Reuters text collection reveal that a high percentage of keywords extracted using this method match the keywords extracted for the same map units using the original WEBSOM method.
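The two labeling strategies contrasted in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract gives no formulas, so the scoring rule in `label_unit_by_frequency` (a word's relative frequency in a unit, weighted by its share of the frequency summed over all units) and the back-projection in `label_unit_by_weights` (multiplying a unit's weight vector by the transposed random projection matrix) are assumed stand-ins. All function and parameter names are hypothetical.

```python
from collections import Counter


def relative_frequencies(docs):
    """Relative frequency of each word over a list of tokenized documents."""
    counts = Counter(word for doc in docs for word in doc)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def label_unit_by_frequency(unit_docs, unit_id, top_k=3):
    """Slow, frequency-comparison labeling (assumed scoring rule).

    unit_docs maps a unit id to the tokenized documents assigned to it.
    Every call scans the documents of every unit, which is what makes this
    approach expensive on large maps.
    """
    own = relative_frequencies(unit_docs[unit_id])
    others = [relative_frequencies(d) for u, d in unit_docs.items() if u != unit_id]
    scores = {}
    for word, freq in own.items():
        elsewhere = sum(o.get(word, 0.0) for o in others)
        # Assumption: favor words frequent in this unit and rare in other units.
        scores[word] = freq * freq / (freq + elsewhere)
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


def label_unit_by_weights(weight_vector, projection, vocabulary, top_k=3):
    """Fast, weight-based labeling (assumed back-projection).

    projection[i][j] is the random projection entry that maps word j to
    compressed dimension i; weight_vector is the unit's trained weights.
    No documents are read, only the map and the projection matrix.
    """
    relevance = [
        sum(projection[i][j] * weight_vector[i] for i in range(len(weight_vector)))
        for j in range(len(vocabulary))
    ]
    ranked = sorted(range(len(vocabulary)), key=lambda j: (-relevance[j], vocabulary[j]))
    return [vocabulary[j] for j in ranked[:top_k]]
```

The first function has to touch every document of every unit for each label, while the second works only from the trained map and the projection matrix, which is the source of the speed difference the abstract describes.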

Pages: 187-194

Number of pages: 8
