NMF based Dimension Reduction Methods for Turkish Text Clustering

Cited by: 0
Authors
Guran, Aysun [1]
Ganiz, Murat Can [1]
Naiboglu, Hamit Selahattin [1]
Kaptikacti, Halil Oguz [1]
Affiliation
[1] Dogus Univ, Dept Comp Engn, TR-34722 Istanbul, Turkey
Keywords
component; Turkish text clustering; k-means; dimension reduction; NMF; NMF based text summarization;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we analyze the effects of NMF-based dimension reduction methods on the clustering of Turkish documents using the k-means clustering algorithm. All experiments are conducted on two datasets, which we call Milliyet4c1k and 1150haber. The NMF-based dimension reduction methods serve two purposes: reducing the original vector space by transformation, and reducing the size and dimensionality of the data by summarizing the original documents. Experimental results show that the NMF transformation yields better clustering results on both datasets. Running k-means on summarized documents produces results almost identical to running k-means on the original documents. Although using summaries instead of full documents does not improve clustering quality, we show that it significantly reduces the size of the processed data and the execution time of the k-means clustering algorithm.
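The transformation route described in the abstract (factor the document-term matrix with NMF, then run k-means on the reduced document vectors) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the toy 6x6 count matrix, the function names `nmf` and `kmeans`, the Lee-Seung multiplicative update rule, the farthest-point initialization, and the iteration counts. The abstract does not state which NMF variant, term weighting, or k-means settings the authors used, and this sketch does not reproduce their experiments.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, rank, iters=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (docs x terms) as W @ H.

    Uses multiplicative updates for the Frobenius-norm objective
    (an assumed choice, not necessarily the paper's).
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = []
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical term-count matrix: rows are documents, columns are terms.
# Documents 0-2 use only the first three terms, documents 3-5 the last three.
V = [
    [3, 2, 1, 0, 0, 0],
    [2, 3, 2, 0, 0, 0],
    [1, 2, 3, 0, 0, 0],
    [0, 0, 0, 3, 2, 1],
    [0, 0, 0, 2, 3, 2],
    [0, 0, 0, 1, 2, 3],
]

W, H = nmf(V, rank=2)      # W holds one 2-dimensional vector per document
labels = kmeans(W, k=2)    # cluster documents in the reduced space
print(labels)
```

Here k-means operates on the rows of `W` (two values per document) rather than on the full term vectors, which is the sense in which the transformation reduces the vector space. For real corpora, a library implementation such as scikit-learn's `NMF` and `KMeans` classes would normally be preferable to this pure-Python version.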
Pages: 5