Comparing keyword extraction techniques for WEBSOM text archives

Cited by: 0

Authors:

Azcarraga, AP ^{[1]}

Yap, TN ^{[1]}

Affiliations:

[1] Natl Univ Singapore, Sch Comp, PRIS Grp, Singapore 117543, Singapore

Source:

ICTAI 2001: 13TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, PROCEEDINGS | 2001

Keywords:

DOI:

10.1109/ICTAI.2001.974464

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The WEBSOM methodology for building very large text archives has a very slow method for extracting meaningful unit labels. This is because the method computes the relative frequencies of all the words of all the documents associated with each unit and then compares these to the relative frequencies of all the words of all the other units of the map. Since maps may have more than 100,000 units and the archive may contain up to 7 million documents, the existing WEBSOM method is not practical. A fast alternative method is based on the distribution of weights in the weight vectors of the trained map, plus a simple manipulation of the random projection matrix used for input data compression. Comparisons made using a WEBSOM archive of the Reuters text collection reveal that a high percentage of keywords extracted using this method match the keywords extracted for the same map units using the original WEBSOM method.
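
The two labeling strategies contrasted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the scoring formulas, so the frequency score below (a word's relative frequency in the unit, weighted by its share of that word's relative frequency across all units) and the back-projection step (multiplying a unit's weight vector by the transpose of the projection matrix) are assumptions chosen for illustration. The function names `labels_by_frequency` and `labels_by_weights` are hypothetical.

```python
from collections import Counter


def relative_frequencies(docs):
    """Relative frequency of each word over a list of tokenized documents."""
    counts = Counter(word for doc in docs for word in doc)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def labels_by_frequency(unit_docs, unit, k=1):
    """Frequency-based labeling (illustrative scoring, an assumption).

    unit_docs maps each map unit to the tokenized documents assigned to it.
    Every unit's word frequencies are compared against those of all other
    units, which is the step that becomes slow on very large maps.
    """
    freqs = {u: relative_frequencies(docs) for u, docs in unit_docs.items()}
    own = freqs[unit]

    def score(word):
        across_units = sum(f.get(word, 0.0) for f in freqs.values())
        return own[word] * own[word] / across_units

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(own, key=lambda w: (-score(w), w))[:k]


def labels_by_weights(weight_vector, projection, vocabulary, k=1):
    """Weight-based labeling (assumed back-projection, not from the paper).

    projection is a d x V matrix (list of d rows) that compresses V-dimensional
    word vectors to d dimensions. Each word is scored by the dot product of
    its projection column with the unit's d-dimensional weight vector.
    """
    d = len(weight_vector)
    scores = [
        sum(projection[i][j] * weight_vector[i] for i in range(d))
        for j in range(len(vocabulary))
    ]
    ranked = sorted(range(len(vocabulary)), key=lambda j: (-scores[j], vocabulary[j]))
    return [vocabulary[j] for j in ranked[:k]]
```

The second function touches only one weight vector and the projection matrix, never the documents themselves, which is the property that lets a weight-based approach avoid the archive-wide frequency comparison.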


Pages: 187-194

Page count: 8

