ON EXPONENTIALLY CONSISTENCY OF LINKAGE-BASED HIERARCHICAL CLUSTERING ALGORITHM USING KOLMOGROV-SMIRNOV DISTANCE

Times cited: 0
Authors
Wang, Tiexing [1]
Liu, Yang [1]
Chen, Biao [1]
Affiliations
[1] Syracuse Univ, Dept EECS, Syracuse, NY 13244 USA
Funding
National Science Foundation (USA);
Keywords
Kolmogorov-Smirnov distance; clustering; exponential consistency; probability of error; hierarchical clustering algorithm; EFFICIENT;
DOI
10.1109/icassp40776.2020.9053708
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
This paper analyzes the performance of linkage-based hierarchical agglomerative clustering algorithms for sequence clustering using the Kolmogorov-Smirnov distance. Data sequences are assumed to be generated from unknown continuous distributions. The goal is to group the data sequences whose underlying generative distributions belong to the same cluster, without a priori knowledge of either the underlying distributions or the number of clusters. Upper bounds on the clustering error probability are derived. These bounds establish that the error probability decays exponentially fast as the sequence length goes to infinity, and the resulting bound on the error exponent has a simple form. Tighter upper bounds on the error probability of the single-linkage and complete-linkage algorithms are derived by exploiting the simplified metric update available in these two special cases. Simulation results are provided to validate the analysis.
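The setting described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the Kolmogorov-Smirnov distance between two sequences is the largest gap between their empirical CDFs, and that merging stops once the closest pair of clusters is farther apart than a user-chosen `threshold`. The names `ks_distance`, `linkage_cluster`, and `threshold`, as well as the Gaussian test data, are hypothetical choices made for illustration.

```python
import random
from bisect import bisect_right


def ks_distance(x, y):
    """Largest gap between the empirical CDFs of samples x and y."""
    xs, ys = sorted(x), sorted(y)
    # Empirical CDFs are step functions, so the supremum of the gap
    # is attained at one of the observed sample points.
    return max(
        abs(bisect_right(xs, t) / len(xs) - bisect_right(ys, t) / len(ys))
        for t in xs + ys
    )


def linkage_cluster(seqs, threshold, linkage="single"):
    """Agglomerative clustering of sequences under the KS distance.

    Assumption (not from the source): merging stops when the closest
    pair of clusters is more than `threshold` apart, so the number of
    clusters does not have to be known in advance.
    """
    n = len(seqs)
    d = [[ks_distance(seqs[i], seqs[j]) for j in range(n)] for i in range(n)]
    # Single linkage uses the closest pair of members, complete linkage
    # the farthest pair.
    agg = min if linkage == "single" else max
    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                dist = agg(d[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        if best[0] > threshold:
            break
        _, a, b = best
        clusters[a] += clusters.pop(b)  # b > a, so index a stays valid
    return [sorted(c) for c in clusters]


# Hypothetical data: three sequences from N(0, 1), three from N(3, 1).
rng = random.Random(0)
seqs = [[rng.gauss(0, 1) for _ in range(500)] for _ in range(3)] + [
    [rng.gauss(3, 1) for _ in range(500)] for _ in range(3)
]
clusters = sorted(linkage_cluster(seqs, threshold=0.3))
print(clusters)
```

With sequences of this length, the within-cluster distances are small and the between-cluster distances are close to the true distribution gap, so a threshold between the two separates the groups. The sketch recomputes cluster distances from the pairwise matrix at every step for clarity rather than using an incremental update.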
Pages: 3997-4001
Page count: 5