Scaling up kernel grower clustering method for large data sets via core-sets

Authors
Chang, Liang [1]
Deng, Xiao-Ming [2, 3]
Zheng, Sui-Wu [1]
Wang, Yong-Qing [1]
Affiliations
[1] Key Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, Beijing 100080, China
[2] Virtual Reality Laboratory, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
[3] National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100080, China
Funding
National Natural Science Foundation of China
Keywords
Data mining - Data structures - Image segmentation - Pattern recognition - Self organizing maps
DOI
10.3724/SP.J.1004.2008.00376
Abstract
Kernel grower is a novel kernel clustering method recently proposed by Camastra and Verri. It performs well on a variety of data sets and compares favorably with popular clustering algorithms. Its main drawback, however, is poor scalability to large data sets, which greatly restricts its application. In this paper, we propose a scaled-up kernel grower method based on core-sets that is significantly faster than the original method on large data sets and can handle very large data sets. Numerical experiments on benchmark and synthetic data sets demonstrate the efficiency of the proposed method. The method is also applied to real image segmentation to illustrate its performance.
Pages: 376 - 382
Related papers
50 in total
  • [41] A novel kernel possibilistic fuzzy c-means clustering algorithm for large scale data sets
    Qu, Yu
    Su, Hongye
    Zhang, Ying
    Chu, Jian
    PROGRESS IN INTELLIGENCE COMPUTATION AND APPLICATIONS, PROCEEDINGS, 2007, : 527 - 530
  • [42] Kernel Density Estimation, Kernel Methods, and Fast Learning in Large Data Sets
    Wang, Shitong
    Wang, Jun
    Chung, Fu-lai
    IEEE TRANSACTIONS ON CYBERNETICS, 2014, 44 (01) : 1 - 20
  • [43] An Evolutionary Clustering Method for Arbitrary Shaped Data Sets
    Liu, Cong
    Wu, Chunxue
    2015 12TH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (FSKD), 2015, : 739 - 743
  • [44] Image-mapped data clustering: An efficient technique for clustering large data sets
    Al-Omari, Faruq
    Al-Fayoumi, Nabeel
    Al-Jarrah, Mohammad
    INTELLIGENT DATA ANALYSIS, 2008, 12 (06) : 573 - 586
  • [45] Measuring Similarity of Complex and Heterogeneous Data in Clustering of Large Data Sets
    Bacelar-Nicolau, Helena
    Nicolau, Fernando
    Sousa, Aurga
    Bacelar-Nicolau, Leonor
    BIOCYBERNETICS AND BIOMEDICAL ENGINEERING, 2009, 29 (02) : 9 - 18
  • [46] Method for Data Analytics on Large Data Sets of Images
    Sathe, Riah
    Mense, Shraddha
    Pradhan, Shashwat
    Netraganti, Ankita
    Aghav, Jagannath
    2016 INTERNATIONAL CONFERENCE ON COMPUTING, ANALYTICS AND SECURITY TRENDS (CAST), 2016, : 335 - 340
  • [47] Super Large Data Sets Clustering by Means Radial Compression
    Lu Zhimao
    Liu Chen
    Zhang Qi
    Sambourou, Massinanke
    Fan Dongmei
    CHINESE JOURNAL OF ELECTRONICS, 2013, 22 (02) : 335 - 340
  • [48] Accelerated EM-based clustering of large data sets
    Jakob J. Verbeek
    Jan R. J. Nunnink
    Nikos Vlassis
    Data Mining and Knowledge Discovery, 2006, 13 : 291 - 307
  • [49] Determination of similarity threshold in clustering problems for large data sets
    Sánchez-Díaz, G
    Martínez-Trinidad, JF
    PROGRESS IN PATTERN RECOGNITION, SPEECH AND IMAGE ANALYSIS, 2003, 2905 : 611 - 618
  • [50] Selective sampling for approximate clustering of very large data sets
    Wang, Liang
    Bezdek, James C.
    Leckie, Christopher
    Kotagiri, Ramamohanarao
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2008, 23 (03) : 313 - 331