Hierarchical Clustering Using Non-Greedy Principal Direction Divisive Partitioning

Author
Martin Nilsson
Affiliation
[1] Los Alamos National Laboratory
Source
Information Retrieval | 2002 / Vol. 5
Keywords
clustering; taxonomy; PCA; classification;
DOI
Not available
Abstract
We present a non-greedy version of the recently published Principal Direction Divisive Partitioning (PDDP) algorithm. The PDDP algorithm creates a hierarchical taxonomy of a data set by successively splitting the data into sub-clusters. At each level, the cluster with the largest variance is split by a hyperplane orthogonal to its leading principal component. The PDDP algorithm is known to produce high-quality clusters, especially when applied to high-dimensional data such as document-word feature matrices. It also scales well with both the size and the dimensionality of the data set. However, at each level only the locally optimal choice of split is considered, which often leads to a globally non-optimal partitioning of the data at later stages. The non-greedy version of the PDDP algorithm (NGPDDP) presented in this paper addresses this problem: at each level, multiple alternative splitting strategies are considered. Results from applying the algorithm to generated and real data (feature vectors from sets of text documents) are presented. The results show substantial improvements in cluster quality.
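The greedy PDDP procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several assumptions: the function names (`pddp_split`, `scatter`, `greedy_pddp`) are hypothetical, the hyperplane is assumed to pass through the cluster centroid, and "largest variance" is taken to mean the total squared deviation from the centroid. The non-greedy variant (NGPDDP) is not shown, because the abstract does not specify which alternative splits it considers.

```python
import numpy as np


def pddp_split(X):
    """Split the rows of X with a hyperplane orthogonal to the leading
    principal direction (assumed here to pass through the centroid)."""
    centered = X - X.mean(axis=0)
    # The first right singular vector of the centered data is the
    # leading principal direction.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projections = centered @ vt[0]
    left = np.where(projections <= 0)[0]
    right = np.where(projections > 0)[0]
    return left, right


def scatter(X):
    """Total squared deviation from the centroid (assumed variance measure)."""
    return float(((X - X.mean(axis=0)) ** 2).sum())


def greedy_pddp(X, n_clusters):
    """Repeatedly split the cluster with the largest scatter.

    Returns a list of index arrays, one per cluster.
    """
    clusters = [np.arange(X.shape[0])]
    while len(clusters) < n_clusters:
        candidates = [i for i, idx in enumerate(clusters) if len(idx) > 1]
        if not candidates:
            break  # nothing left that can be split
        worst = max(candidates, key=lambda i: scatter(X[clusters[i]]))
        idx = clusters.pop(worst)
        left, right = pddp_split(X[idx])
        if len(left) == 0 or len(right) == 0:
            clusters.append(idx)  # degenerate cluster (identical points)
            break
        clusters.extend([idx[left], idx[right]])
    return clusters


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = np.vstack([
        rng.normal(0.0, 0.5, size=(20, 2)),
        rng.normal(10.0, 0.5, size=(20, 2)),
    ])
    for members in greedy_pddp(data, 2):
        print(sorted(members.tolist()))
```

Each step commits to a single split of the highest-scatter cluster, which is the locally optimal choice the abstract identifies as the source of globally non-optimal partitions.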
Pages: 311-321
Page count: 10