Estimating Generalized Dunn's Cluster Validity Indices for Big Data

Cited by: 2
Authors
Rathore, Punit [1]
Ghafoori, Zahra [2]
Bezdek, James C. [2]
Palaniswami, Marimuthu [1]
Leckie, Christopher [2]
Affiliations
[1] Univ Melbourne, Dept Elect & Elect Engn, Melbourne, Vic, Australia
[2] Univ Melbourne, Sch Comp & Informat Syst, Melbourne, Vic, Australia
DOI
10.1109/SMC.2018.00120
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Dunn's internal cluster validity index and its generalizations assess partition quality. For partitions of n samples of p-dimensional feature vector data, all but two of the generalized Dunn's indices (GDIs) have quadratic time complexity O(pn^2), so computation is untenable for very large values of n. In this paper, we present two methods for approximating GDIs based on Maximin (MM) sampling. MM sampling identifies a skeleton of the full partition that usually contains some of the boundary points in each cluster, and these skeleton points are used to compute the GDIs. We compare our algorithms with a support vector machine-based boundary extraction method and a random-sampling-based estimation method. Experiments on four real and synthetic datasets show that computing approximations to three GDIs with the MM skeleton is both computationally tractable and reliably accurate.
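The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes Euclidean distance, uses only the original Dunn's index (one member of the GDI family), and applies a plain greedy maximin (farthest-first) selection over the whole dataset. The function names `maximin_sample` and `dunn_index` and the synthetic two-cluster data are hypothetical, chosen only for illustration.

```python
import numpy as np


def maximin_sample(X, m, start=0):
    """Greedy maximin (farthest-first) selection of m row indices of X."""
    idx = [start]
    # d[i] = distance from point i to its nearest already-selected sample
    d = np.linalg.norm(X - X[start], axis=1)
    for _ in range(m - 1):
        nxt = int(np.argmax(d))  # point farthest from the current sample set
        idx.append(nxt)
        d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
    return np.array(idx)


def dunn_index(X, labels):
    """Original Dunn's index: smallest between-cluster distance
    divided by the largest cluster diameter (O(p n^2) time and O(n^2) memory)."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    ks = np.unique(labels)
    max_diam = max(D[np.ix_(labels == k, labels == k)].max() for k in ks)
    min_sep = min(
        D[np.ix_(labels == a, labels == b)].min()
        for i, a in enumerate(ks)
        for b in ks[i + 1:]
    )
    return min_sep / max_diam


# Synthetic data (an assumption for this sketch): two well-separated 2-D clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (500, 2)), rng.normal(10.0, 0.5, (500, 2))])
labels = np.repeat([0, 1], 500)

full = dunn_index(X, labels)            # index on all n = 1000 points
idx = maximin_sample(X, m=40)           # 40-point maximin skeleton
est = dunn_index(X[idx], labels[idx])   # index on the skeleton only
```

Because the skeleton is a subset of the data, its cluster diameters cannot exceed the full ones and its between-cluster distances cannot fall below the full ones, so in this sketch `est` is never smaller than `full` as long as every cluster contributes at least two sampled points.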
Pages: 656-661
Page count: 6