Hierarchical Clustering Using Non-Greedy Principal Direction Divisive Partitioning

Author
Martin Nilsson
Affiliation
[1] Los Alamos National Laboratory,
Source
Information Retrieval | 2002 / Vol. 5
Keywords
clustering; taxonomy; PCA; classification;
DOI
Not available
Abstract
We present a non-greedy version of the recently published Principal Direction Divisive Partitioning (PDDP) algorithm. The PDDP algorithm creates a hierarchical taxonomy of a data set by successively splitting the data into sub-clusters. At each level the cluster with the largest variance is split by a hyperplane orthogonal to its leading principal component. The PDDP algorithm is known to produce high-quality clusters, especially when applied to high-dimensional data, such as document-word feature matrices. It also scales well with both the size and the dimensionality of the data set. However, at each level only the locally optimal choice of splitting is considered. At a later stage this often leads to a non-optimal global partitioning of the data. The non-greedy version of the PDDP algorithm (NGPDDP) presented in this paper addresses this problem. At each level multiple alternative splitting strategies are considered. Results from applying the algorithm to generated and real data (feature vectors from sets of text documents) are presented. The results show substantial improvements in the cluster quality.
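The greedy baseline described in the abstract can be illustrated with a minimal sketch. This is not the NGPDDP method of the paper, whose alternative splitting strategies the abstract does not spell out. It only shows the basic PDDP step: pick the cluster with the largest spread and cut it with a hyperplane through its centroid, orthogonal to its leading principal direction. The function names (`pddp_split`, `scatter`, `pddp`) and the use of total squared distance to the centroid as the "variance" measure are assumptions made for illustration.

```python
import numpy as np


def pddp_split(X):
    """Split the rows of X by the hyperplane through the centroid that is
    orthogonal to the leading principal direction. Returns two boolean masks."""
    centered = X - X.mean(axis=0)
    # The first right singular vector of the centered data is the
    # leading principal direction.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = centered @ vt[0]
    return proj <= 0, proj > 0


def scatter(X):
    """Total squared distance to the centroid (assumed spread measure)."""
    return float(((X - X.mean(axis=0)) ** 2).sum())


def pddp(X, n_clusters):
    """Greedy PDDP: repeatedly split the cluster with the largest scatter.
    Returns a list of index arrays into X."""
    clusters = [np.arange(len(X))]
    while len(clusters) < n_clusters:
        # Greedy choice: only the locally largest-scatter cluster is considered.
        i = max(range(len(clusters)), key=lambda k: scatter(X[clusters[k]]))
        if scatter(X[clusters[i]]) == 0:
            break  # nothing left to split (single or identical points)
        idx = clusters.pop(i)
        left, right = pddp_split(X[idx])
        clusters += [idx[left], idx[right]]
    return clusters


if __name__ == "__main__":
    X = np.array([[0, 0], [0, 1], [1, 0],
                  [10, 10], [10, 11], [11, 10]], dtype=float)
    for members in pddp(X, 2):
        print(members.tolist())
```

The greedy choice happens in the `max(...)` line: one cluster is committed to at each level, which is the behaviour the abstract identifies as a source of non-optimal global partitions.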
Pages: 311 - 321
Page count: 10
Related papers
50 in total
  • [1] Hierarchical clustering using non-greedy principal direction divisive partitioning
    Nilsson, M
    INFORMATION RETRIEVAL, 2002, 5 (04): : 311 - 321
  • [2] Principal Direction Divisive Partitioning
    Daniel Boley
    Data Mining and Knowledge Discovery, 1998, 2 : 325 - 344
  • [3] Principal direction divisive partitioning
    Boley, D
    DATA MINING AND KNOWLEDGE DISCOVERY, 1998, 2 (04) : 325 - 344
  • [4] Evolutionary Principal Direction Divisive Partitioning
    Tasoulis, Sotiris K.
    Tasoulis, Dimitris K.
    Plagianakos, Vassilis P.
    2010 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2010,
  • [5] Enhancing principal direction divisive clustering
    Tasoulis, S. K.
    Tasoulis, D. K.
    Plagianakos, V. P.
    PATTERN RECOGNITION, 2010, 43 (10) : 3391 - 3411
  • [6] Error analysis of automatic speech recognition using Principal Direction Divisive Partitioning
    McKoskey, D
    Boley, D
    MACHINE LEARNING: ECML 2000, 2000, 1810 : 263 - 270
  • [7] A NON-GREEDY APPROACH TO TREE-STRUCTURED CLUSTERING
    MILLER, D
    ROSE, K
    PATTERN RECOGNITION LETTERS, 1994, 15 (07) : 683 - 690
  • [8] Principal direction divisive partitioning with kernels and k-means steering
    Zeimpekis, Dimitrios
    Gallopoulos, Efstratios
    SURVEY OF TEXT MINING II: CLUSTERING, CLASSIFICATION, AND RETRIEVAL, 2008, : 45 - 64
  • [9] Non-Greedy L21-Norm Maximization for Principal Component Analysis
    Nie, Feiping
    Tian, Lai
    Huang, Heng
    Ding, Chris
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 5277 - 5286
  • [10] Diagonal principal component analysis with non-greedy l1-norm maximization for face recognition
    Yu, Qiang
    Wang, Rong
    Yang, Xiaojun
    Li, Bing Nan
    Yao, Minli
    NEUROCOMPUTING, 2016, 171 : 57 - 62