Approximating Dunn's Cluster Validity Indices for Partitions of Big Data

Cited by: 19
|
Authors
Rathore, Punit [1]
Ghafoori, Zahra [2]
Bezdek, James C. [2]
Palaniswami, Marimuthu [1]
Leckie, Christopher [2]
Affiliations
[1] Univ Melbourne, Dept Elect & Elect Engn, Parkville, Vic 3051, Australia
[2] Univ Melbourne, Sch Comp & Informat Syst, Parkville, Vic 3051, Australia
Keywords
Approximate Dunn's indices; big data; boundary point estimation; data skeleton; Dunn's index (DI); internal cluster validity; Maximin sampling; VALIDATION; NUMBER;
DOI
10.1109/TCYB.2018.2806886
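Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract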
Dunn's internal cluster validity index is used to assess partition quality and subsequently identify a "best" crisp partition of n objects. Computing Dunn's index (DI) for partitions of n p-dimensional feature vectors has quadratic time complexity O(pn^2), so its computation is impractical for very large values of n. This note presents six methods for approximating DI. Four methods are based on Maximin sampling, which identifies a skeleton of the full partition that contains some boundary points in each cluster. Two additional methods are presented that estimate boundary points associated with unsupervised training of one-class support vector machines. Numerical examples compare approximations to DI based on all six methods. Four experiments on seven real and synthetic data sets support our assertion that computing approximations to DI with an incremental, neighborhood-based Maximin skeleton is both tractable and reliably accurate.
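As a rough illustration of the idea in the abstract, the sketch below computes the exact DI by brute force (the quadratic cost mentioned above) and then recomputes it on a small Maximin sample. It assumes the classical definition of DI (smallest between-cluster point distance divided by largest cluster diameter) and plain greedy farthest-first Maximin sampling; it is not one of the six methods from the paper, which are incremental and neighborhood-based. The function names `dunn_index` and `maximin_sample` and the toy data are hypothetical.

```python
import math
from itertools import combinations


def dunn_index(points, labels):
    """Brute-force Dunn's index: min between-cluster distance / max cluster diameter."""
    clusters = {}
    for x, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(x)
    # Largest within-cluster pairwise distance over all clusters.
    max_diam = max(
        math.dist(a, b)
        for c in clusters.values()
        for a, b in combinations(c, 2)
    )
    # Smallest distance between two points that lie in different clusters.
    min_sep = min(
        math.dist(a, b)
        for ci, cj in combinations(list(clusters.values()), 2)
        for a in ci
        for b in cj
    )
    return min_sep / max_diam


def maximin_sample(points, m, start=0):
    """Indices of m points picked by greedy Maximin (farthest-first) sampling."""
    chosen = [start]
    # d[i] holds the distance from point i to its nearest chosen point.
    d = [math.dist(p, points[start]) for p in points]
    while len(chosen) < m:
        nxt = max(range(len(points)), key=d.__getitem__)
        chosen.append(nxt)
        d = [min(d[i], math.dist(points[i], points[nxt])) for i in range(len(points))]
    return chosen


# Toy data (an assumption for illustration): two well-separated square clusters.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

exact = dunn_index(points, labels)
idx = maximin_sample(points, m=4)
approx = dunn_index([points[i] for i in idx], [labels[i] for i in idx])
print(exact, approx)
```

The sample must contain at least two points from some cluster and at least one point from two different clusters, otherwise the approximation is undefined; this sketch does not guard against that case.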
Pages: 1629-1641
Page count: 13
Related papers
50 in total
  • [31] On monotonic tendency of some fuzzy cluster validity indices for high-dimensional data
    Eustaquio, Fernanda
    Nogueira, Tatiane
    2018 7TH BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS), 2018, : 558 - 563
  • [32] Big Data clustering validity
    Tlili, Monia
    Hamdani, Tarek M.
    2014 6TH INTERNATIONAL CONFERENCE OF SOFT COMPUTING AND PATTERN RECOGNITION (SOCPAR), 2014, : 348 - 352
  • [33] Parallel and scalable Dunn Index for the validation of big data clusters
    Ben Ncir, Chiheb-Eddine
    Hamza, Abdallah
    Bouaguel, Waad
    PARALLEL COMPUTING, 2021, 102
  • [34] Role of Cluster Validity Indices in Delineation of Precipitation Regions
    Bhatia, Nikhil
    Sojan, Jency M.
    Simonovic, Slobodan
    Srivastav, Roshan
    WATER, 2020, 12 (05)
  • [35] Two cluster validity indices for the LAMDA clustering method
    Botia Valderrama, Javier Fernando
    Luis Botia Valderrama, Diego Jose
    APPLIED SOFT COMPUTING, 2020, 89 (89)
  • [36] A new clustering algorithm based on cluster validity indices
    Kim, M
    Ramakrishna, RS
    DISCOVERY SCIENCE, PROCEEDINGS, 2004, 3245 : 322 - 329
  • [37] Cluster validity indices for mixture hazards regression models
    Chang, Yi-Wen
    Lu, Kang-Ping
    Chang, Shao-Tung
    MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2020, 17 (02) : 1616 - 1636
  • [38] Ground truth bias in external cluster validity indices
    Lei, Yang
    Bezdek, James C.
    Romano, Simone
    Nguyen Xuan Vinh
    Chan, Jeffrey
    Bailey, James
    PATTERN RECOGNITION, 2017, 65 : 58 - 70
  • [39] On Fuzzy Cluster Validity Indices for the Objects of Mixed Features
    Lee, Mahnhoon
    2009 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-3, 2009, : 390 - 395
  • [40] New Evaluation Method for Fuzzy Cluster Validity Indices
    Perez-Sanchez, Ismay
    Medina-Perez, Miguel Angel
    Monroy, Raul
    Loyola-Gonzalez, Octavio
    Gutierrez-Rodriguez, Andres Eduardo
    IEEE ACCESS, 2025, 13 : 22728 - 22744