Document Clustering Using Incremental and Pairwise Approaches

Cited by: 0
Authors
Tran, Tien [1]
Nayak, Richi [1]
Bruza, Peter [1]
Affiliation
[1] Queensland Univ Technol, Brisbane, Qld 4001, Australia
Source
Keywords
Clustering; structure; content; XML; INEX; 2007;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper presents the experiments and results of an approach to clustering the large Wikipedia dataset in the INEX 2007 Document Mining Challenge. The approach combines an incremental clustering method with a pairwise clustering method. The incremental method first reduces the dimension of the dataset by grouping the documents into an undefined number of clusters, which makes the clustering task feasible on a large dataset. The pairwise method then clusters this lower-dimension dataset into the required number of clusters. In this way, the large number of documents is clustered successfully while still achieving accuracy in the clustering solution.
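The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no details of their similarity measures or document representation, so the cosine similarity, the centroid updates, the `threshold` parameter, and the function names `incremental_stage` and `pairwise_stage` are all assumptions chosen for illustration. Stage one makes a single pass and opens a new cluster whenever no existing cluster is similar enough, so the number of clusters is not fixed in advance; stage two repeatedly merges the most similar pair of clusters until the required number remains.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def incremental_stage(docs, threshold):
    """Single pass over docs (assumed leader-style scheme, not from the paper).

    Each document joins the most similar existing cluster if the similarity
    reaches `threshold`; otherwise it starts a new cluster. The number of
    resulting clusters is therefore not specified in advance.
    """
    clusters = []   # each cluster is a list of document indices
    centroids = []  # running centroid of each cluster
    for i, doc in enumerate(docs):
        best, best_sim = None, threshold
        for c, cen in enumerate(centroids):
            sim = cosine(doc, cen)
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append([i])
            centroids.append(list(doc))
        else:
            clusters[best].append(i)
            centroids[best] = centroid([docs[j] for j in clusters[best]])
    return clusters


def pairwise_stage(docs, clusters, k):
    """Merge the most similar pair of clusters until only k remain."""
    clusters = [list(c) for c in clusters]  # do not mutate the input
    while len(clusters) > k:
        cents = [centroid([docs[j] for j in c]) for c in clusters]
        best_pair, best_sim = None, float("-inf")
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sim = cosine(cents[a], cents[b])
                if sim > best_sim:
                    best_pair, best_sim = (a, b), sim
        a, b = best_pair
        clusters[a].extend(clusters[b])
        del clusters[b]
    return clusters


# Toy feature vectors (hypothetical data, for illustration only).
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.8, 0.5]]
coarse = incremental_stage(docs, threshold=0.95)
final = pairwise_stage(docs, coarse, k=2)
print(coarse)
print(final)
```

The pairwise stage compares every pair of clusters, which is quadratic in the number of clusters. That cost is the reason for running it only on the reduced set produced by the single-pass stage rather than on the full document collection.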
Pages: 222-233
Page count: 12
Related papers
50 in total
  • [21] Pairwise Probabilistic Clustering Using Evidence Accumulation
    Bulo, Samuel Rota
    Lourenco, Andre
    Fred, Ana
    Pelillo, Marcello
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, 2010, 6218 : 395 - +
  • [22] Using clustering for document reconstruction
    Ukovich, Anna
    Zacchigna, Alessandra
    Ramponi, Giovanni
    Schoier, Gabriella
    IMAGE PROCESSING: ALGORITHMS AND SYSTEMS, NEURAL NETWORKS, AND MACHINE LEARNING, 2006, 6064
  • [23] Incremental Clustering for Categorical Data Using Clustering Ensemble
    Li Taoying
    Chne Yan
    Qu Lili
    Mu Xiangwei
    PROCEEDINGS OF THE 29TH CHINESE CONTROL CONFERENCE, 2010, : 2519 - 2524
  • [24] INCREMENTAL CLUSTERING FOR VERY LARGE DOCUMENT DATABASES - INITIAL MARIAN EXPERIENCE
    CAN, F
    FOX, EA
    SNAVELY, CD
    FRANCE, RK
    INFORMATION SCIENCES, 1995, 84 (1-2) : 101 - 114
  • [25] Web document clustering using Document Index Graph
    Momin, B. F.
    Kulkarni, P. J.
    Chaudhari, Amol
    2006 INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND COMMUNICATIONS, VOLS 1 AND 2, 2007, : 30 - 35
  • [26] Pairwise constraints-guided non-negative matrix factorization for document clustering
    Yang, Yu-Jiu
    Hu, Bao-Gang
    PROCEEDINGS OF THE IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE: WI 2007, 2007, : 250 - +
  • [27] Efficient Clustering Approach using Incremental and Hierarchical Clustering Methods
    Srinivas, M.
    Mohan, C. Krishna
    2010 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS IJCNN 2010, 2010,
  • [28] Pairwise clustering using a Monte Carlo Markov Chain
    Stosic, Borko D.
    PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2009, 388 (12) : 2373 - 2382
  • [29] Pairwise data clustering using monotone game dynamics
    Pavan, M
    Pelillo, M
    AI*IA 2003: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2003, 2829 : 201 - 212
  • [30] Refinement of Document Clustering by Using NMF
    Shinnou, Hiroyuki
    Sasaki, Minoru
    PACLIC 21: THE 21ST PACIFIC ASIA CONFERENCE ON LANGUAGE, INFORMATION AND COMPUTATION, PROCEEDINGS, 2007, : 430 - 439