NMF based Dimension Reduction Methods for Turkish Text Clustering

Cited by: 0
Authors
Guran, Aysun [1]
Ganiz, Murat Can [1]
Naiboglu, Hamit Selahattin [1]
Kaptikacti, Halil Oguz [1]
Affiliation
[1] Dogus Univ, Dept Comp Engn, TR-34722 Istanbul, Turkey
Keywords
component; Turkish text clustering; k-means; dimension reduction; NMF; NMF based text summarization;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, we analyze the effects of NMF-based dimension reduction methods on the clustering of Turkish documents using the k-means clustering algorithm. All experiments are conducted on two datasets, which we call Milliyet4c1k and 1150haber. The NMF-based dimension reduction methods serve two purposes: to reduce the original vector space by transformation, and to reduce size and dimension by summarizing the original documents. Experimental results show that NMF transformation yields better clustering results on both datasets. Running k-means on summarized documents produces results almost identical to those of k-means on the original documents. Although using summaries instead of full documents does not improve clustering quality, we show that it significantly reduces the size of the processed data and the execution time of the k-means clustering algorithm.
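The transformation route described in the abstract (factor the document-term matrix with NMF, then run k-means on the reduced document vectors) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration only: the toy count matrix `V`, the rank of 2, the multiplicative-update solver, and the farthest-point seeding of k-means are not taken from the paper, which does not state its solver or settings in this record.

```python
import numpy as np


def nmf(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (docs x terms) as W @ H.

    Uses multiplicative updates for the squared-error objective.
    The choice of solver is an assumption, not the paper's method.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def kmeans(X, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point seeding."""
    centers = [X[0]]
    while len(centers) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's center as is
                centers[j] = X[labels == j].mean(axis=0)
    return labels


# Hypothetical toy data: 6 documents x 6 terms, two disjoint vocabularies.
V = np.array([
    [3, 2, 1, 0, 0, 0],
    [2, 3, 1, 0, 0, 0],
    [1, 2, 3, 0, 0, 0],
    [0, 0, 0, 3, 2, 1],
    [0, 0, 0, 2, 3, 1],
    [0, 0, 0, 1, 2, 3],
], dtype=float)

W, H = nmf(V, rank=2)        # W: documents in the reduced 2-dimensional space
labels = kmeans(W, k=2)      # cluster the reduced vectors instead of V

print("reduced shape:", W.shape)
print("cluster labels:", labels)
```

Here k-means operates on the rows of `W` (two values per document) rather than on the six-column term vectors, which is the sense in which the transformation reduces the vector space. On real corpora the document-term matrix would typically be TF-IDF weighted and far larger, and a library solver would be a more practical choice than this hand-written loop.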
Pages: 5