Approximating Dunn's Cluster Validity Indices for Partitions of Big Data

Cited by: 19
Authors
Rathore, Punit [1]
Ghafoori, Zahra [2]
Bezdek, James C. [2]
Palaniswami, Marimuthu [1]
Leckie, Christopher [2]
Affiliations
[1] Univ Melbourne, Dept Elect & Elect Engn, Parkville, Vic 3051, Australia
[2] Univ Melbourne, Sch Comp & Informat Syst, Parkville, Vic 3051, Australia
Keywords
Approximate Dunn's indices; big data; boundary point estimation; data skeleton; Dunn's index (DI); internal cluster validity; Maximin sampling; VALIDATION; NUMBER
DOI
10.1109/TCYB.2018.2806886
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Dunn's internal cluster validity index is used to assess partition quality and subsequently identify a "best" crisp partition of n objects. Computing Dunn's index (DI) for partitions of n p-dimensional feature vectors has quadratic time complexity O(pn^2), so its computation is impractical for very large values of n. This note presents six methods for approximating DI. Four methods are based on Maximin sampling, which identifies a skeleton of the full partition that contains some boundary points in each cluster. Two additional methods estimate boundary points associated with unsupervised training of one-class support vector machines. Numerical examples compare approximations to DI based on all six methods. Four experiments on seven real and synthetic data sets support our assertion that computing approximations to DI with an incremental, neighborhood-based Maximin skeleton is both tractable and reliably accurate.
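The quadratic cost mentioned in the abstract can be seen in a minimal Python sketch of the classical form of Dunn's index (smallest between-cluster point distance divided by largest cluster diameter), together with a generic Maximin sampler. This is an illustration under stated assumptions, not the paper's six methods: the function names `pairwise_dist`, `dunn_index`, and `maximin_sample` are hypothetical, Euclidean distance is assumed, and the sampler is the plain farthest-point rule rather than the incremental, neighborhood-based variant the authors evaluate.

```python
import numpy as np


def pairwise_dist(A, B):
    """Euclidean distances between every row of A and every row of B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def dunn_index(X, labels):
    """Classical Dunn's index: min inter-cluster distance / max cluster diameter.

    Compares all point pairs, so the cost grows as O(p * n^2).
    Assumes at least two clusters and at least one cluster with two
    distinct points (otherwise the denominator is zero).
    """
    clusters = [X[labels == k] for k in np.unique(labels)]
    max_diameter = max(pairwise_dist(C, C).max() for C in clusters)
    min_separation = min(
        pairwise_dist(clusters[i], clusters[j]).min()
        for i in range(len(clusters))
        for j in range(i + 1, len(clusters))
    )
    return min_separation / max_diameter


def maximin_sample(X, m, start=0):
    """Generic Maximin (farthest-point) sampling of m row indices of X.

    Each new point maximizes its distance to the nearest point already
    chosen, which tends to pick points near the boundary of the data.
    """
    chosen = [start]
    nearest = np.linalg.norm(X - X[start], axis=1)
    for _ in range(m - 1):
        nxt = int(np.argmax(nearest))
        chosen.append(nxt)
        nearest = np.minimum(nearest, np.linalg.norm(X - X[nxt], axis=1))
    return np.array(chosen)


# Toy data: two well-separated clusters of two points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]])
labels = np.array([0, 0, 1, 1])

full_di = dunn_index(X, labels)      # exact index on all n points
skeleton = maximin_sample(X, 2)      # indices of a 2-point sample
```

One simple way to approximate DI, assumed here only for illustration, is to call `dunn_index(X[skeleton], labels[skeleton])` on a sample large enough to contain several points from every cluster; the cost then depends on the sample size rather than on n.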
Pages: 1629-1641
Page count: 13
    PRZEGLAD ELEKTROTECHNICZNY, 2012, 88 (02): : 126 - 131