Estimating Generalized Dunn's Cluster Validity Indices for Big Data

Cited by: 2
Authors
Rathore, Punit [1]
Ghafoori, Zahra [2]
Bezdek, James C. [2]
Palaniswami, Marimuthu [1]
Leckie, Christopher [2]
Affiliations
[1] Univ Melbourne, Dept Elect & Elect Engn, Melbourne, Vic, Australia
[2] Univ Melbourne, Sch Comp & Informat Syst, Melbourne, Vic, Australia
Keywords
DOI
10.1109/SMC.2018.00120
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
Dunn's internal cluster validity index and its generalizations assess partition quality. For partitions of n samples of p-dimensional feature vector data, all but two of the generalized Dunn's indices (GDIs) have quadratic time complexity O(pn^2), so computation is untenable for very large values of n. In this paper, we present two methods for approximating GDIs based on Maximin (MM) sampling. MM sampling identifies a skeleton of the full partition that usually contains some of the boundary points in each cluster, and these points are used to compute the GDIs. We compare our algorithms with a support vector machine-based boundary extraction method and a random-sampling-based estimation method. Experiments on four real and synthetic datasets show that computing approximations to (three) GDIs with the MM skeleton is both computationally tractable and reliably accurate.
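The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it computes only the original Dunn's index (minimum between-cluster distance divided by maximum cluster diameter) and applies a generic greedy maximin sampler inside each cluster. The function names (pairwise_dist, dunn_index, maximin_sample, approx_dunn), the Euclidean metric, and the toy data are assumptions made for illustration.

```python
import numpy as np

def pairwise_dist(A, B):
    # Euclidean distances between every row of A and every row of B.
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def dunn_index(X, labels):
    # Original Dunn's index; cost grows as O(p n^2) in the number of samples n.
    clusters = [X[labels == k] for k in np.unique(labels)]
    max_diam = max(pairwise_dist(c, c).max() for c in clusters)
    min_sep = min(
        pairwise_dist(clusters[i], clusters[j]).min()
        for i in range(len(clusters))
        for j in range(i + 1, len(clusters))
    )
    return min_sep / max_diam

def maximin_sample(X, m, start=0):
    # Greedy maximin sampling: each new point is the one farthest
    # from the set of points already chosen.
    chosen = [start]
    d = np.linalg.norm(X - X[start], axis=1)
    for _ in range(min(m, len(X)) - 1):
        nxt = int(np.argmax(d))
        chosen.append(nxt)
        d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
    return np.array(chosen)

def approx_dunn(X, labels, m_per_cluster):
    # Assumption for this sketch: sample each cluster separately so that
    # every cluster is represented, then evaluate Dunn's index on the sample.
    keep = []
    for k in np.unique(labels):
        idx = np.flatnonzero(labels == k)
        keep.extend(idx[maximin_sample(X[idx], m_per_cluster)])
    keep = np.array(keep)
    return dunn_index(X[keep], labels[keep])

# Toy data (hypothetical): two unit squares, ten units apart.
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1],
              [10, 0], [11, 0], [10, 1], [11, 1]], dtype=float)
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

exact = dunn_index(X, labels)
approx = approx_dunn(X, labels, m_per_cluster=2)
print(exact, approx)
```

Because the sample is a subset of the data, the sampled between-cluster distance can only be larger than or equal to the true one and the sampled diameter can only be smaller than or equal to the true one, so this particular estimate never falls below the exact value. Maximin sampling tends to pick points far apart from each other, which is why the sampled diameters are usually close to the true ones.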
Pages: 656-661
Page count: 6
Related papers
50 in total
  • [1] Approximating Dunn's Cluster Validity Indices for Partitions of Big Data
    Rathore, Punit
    Ghafoori, Zahra
    Bezdek, James C.
    Palaniswami, Marimuthu
    Leckie, Christopher
    IEEE TRANSACTIONS ON CYBERNETICS, 2019, 49 (05) : 1629 - 1641
  • [2] An Approach to Silhouette and Dunn Clustering Indices Applied to Big Data in Spark
    Maria Luna-Romera, Jose
    del Mar Martinez-Ballesteros, Maria
    Garcia-Gutierrez, Jorge
    Riquelme-Santos, Jose C.
    ADVANCES IN ARTIFICIAL INTELLIGENCE, CAEPIA 2016, 2016, 9868 : 160 - 169
  • [3] A Data Clustering Tool with Cluster Validity Indices
    Qiao, Haiyan
    Edwards, Brandon
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTING, ENGINEERING AND INFORMATION, 2009, : 303 - 309
  • [4] Generalized Information Theoretic Cluster Validity Indices for Soft Clusterings
    Lei, Yang
    Bezdek, James C.
    Chan, Jeffrey
    Nguyen Xuan Vinh
    Romano, Simone
    Bailey, James
    2014 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DATA MINING (CIDM), 2014, : 24 - 31
  • [5] An approach to validity indices for clustering techniques in Big Data
    Luna-Romera J.M.
    García-Gutiérrez J.
    Martínez-Ballesteros M.
    Riquelme Santos J.C.
    Progress in Artificial Intelligence, 2018, 7 (2) : 81 - 94
  • [6] Dunn's Cluster Validity Index as a Contrast Measure of VAT Images
    Havens, Timothy C.
    Bezdek, James C.
    Keller, James M.
    Popescu, Mihail
    19TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1-6, 2008, : 2304 - 2307
  • [7] Modified Dunn's cluster validity index based on graph theory
    Ilc, Nejc
    PRZEGLAD ELEKTROTECHNICZNY, 2012, 88 (02) : 126 - 131
  • [8] Performance of eight cluster validity indices on hyperspectral data
    Fontán, FM
    Jiménez, LO
    ALGORITHMS AND TECHNOLOGIES FOR MULTISPECTRAL, HYPERSPECTRAL, AND ULTRASPECTRAL IMAGERY X, 2004, 5425 : 147 - 158
  • [9] Generalized Possibilistic Fuzzy C-Means with novel cluster validity indices for clustering noisy data
    Askari, S.
    Montazerin, N.
    Zarandi, M. H. Fazel
    APPLIED SOFT COMPUTING, 2017, 53 : 262 - 283
  • [10] Analysis of Incremental Cluster Validity for Big Data Applications
    Ibrahim, Omar A.
    Wang, Yiqing
    Keller, James M.
    INTERNATIONAL JOURNAL OF UNCERTAINTY FUZZINESS AND KNOWLEDGE-BASED SYSTEMS, 2018, 26 : 47 - 62