An Improved Affinity Propagation Clustering Algorithm for Large-scale Data Sets

Authors
Liu, Xiaonan [1]
Yin, Meijuan [1]
Luo, Junyong [1]
Chen, Wuping [2]
Affiliations
[1] State Key Lab Math Engn & Adv Comp, Zhengzhou, Peoples R China
[2] Sci & Technol Informat Assurance Lab, Beijing, Peoples R China
Keywords
Data clustering; affinity propagation; hierarchical; selection; clustering center
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Affinity Propagation (AP) clustering does not require the number of clusters to be set in advance and offers advantages in efficiency and accuracy, but it is not suitable for clustering large-scale data. To achieve both low time complexity and good accuracy when applying affinity propagation to large-scale data, an improved AP clustering algorithm named hierarchical affinity propagation (HAP) is proposed, which clusters data points by applying the AP algorithm several times to data at different levels. First, the data set to be clustered is divided into several subsets, each of which can be clustered efficiently by the AP algorithm. Then, the AP algorithm is run on each subset to select that subset's cluster centers. Next, AP clustering is applied again to all of the local cluster centers to select well-suited global exemplars for the whole data set. Finally, all data points are clustered according to the similarity between each data point and the global exemplars, so that large-scale data can be clustered efficiently and accurately. Experimental results on real and simulated data sets show that, compared with the traditional AP and adaptive AP algorithms, the HAP algorithm greatly reduces clustering time while achieving relatively better clustering results.
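The four steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the data set is split, which similarity measure is used, or how AP is parameterized, so the random split, the negative squared Euclidean similarity, the median preference, the damping factor, and the names `similarity`, `affinity_propagation`, and `hap` are all assumptions made here.

```python
import random


def similarity(a, b):
    """Negative squared Euclidean distance (assumed similarity measure)."""
    return -sum((x - y) ** 2 for x, y in zip(a, b))


def affinity_propagation(points, damping=0.7, iterations=200):
    """Standard AP message passing; returns the indices of the exemplars."""
    n = len(points)
    if n <= 1:
        return list(range(n))
    s = [[similarity(p, q) for q in points] for p in points]
    off_diag = sorted(s[i][k] for i in range(n) for k in range(n) if i != k)
    preference = off_diag[len(off_diag) // 2]  # median preference (assumption)
    for i in range(n):
        s[i][i] = preference
    r = [[0.0] * n for _ in range(n)]  # responsibilities
    a = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        for i in range(n):
            vals = [a[i][k] + s[i][k] for k in range(n)]
            best = max(range(n), key=vals.__getitem__)
            first = vals[best]
            second = max(v for k, v in enumerate(vals) if k != best)
            for k in range(n):
                new = s[i][k] - (second if k == best else first)
                r[i][k] = damping * r[i][k] + (1 - damping) * new
        for k in range(n):
            pos = [max(r[i][k], 0.0) for i in range(n)]
            total = sum(pos) - pos[k]  # positive responsibilities from i != k
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, r[k][k] + total - pos[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if r[k][k] + a[k][k] > 0]
    if not exemplars:  # fall back to the single strongest candidate
        exemplars = [max(range(n), key=lambda k: r[k][k] + a[k][k])]
    return exemplars


def hap(points, subset_size=15, seed=0):
    """Hierarchical AP sketch: split, local AP, global AP, final assignment."""
    order = list(range(len(points)))
    random.Random(seed).shuffle(order)  # random split is an assumption
    # Step 1: divide the data set into subsets small enough for plain AP.
    subsets = [order[i:i + subset_size]
               for i in range(0, len(order), subset_size)]
    # Step 2: run AP on each subset to pick its local cluster centers.
    local = []
    for idx in subsets:
        chosen = affinity_propagation([points[j] for j in idx])
        local.extend(idx[c] for c in chosen)
    # Step 3: run AP again on the local centers to pick global exemplars.
    chosen = affinity_propagation([points[j] for j in local])
    exemplars = [points[local[c]] for c in chosen]
    # Step 4: assign every point to its most similar global exemplar.
    labels = [max(range(len(exemplars)),
                  key=lambda e: similarity(p, exemplars[e]))
              for p in points]
    return exemplars, labels
```

Calling `hap(points)` on a list of coordinate tuples returns the global exemplars and, for each point, the index of the exemplar it was assigned to. The intended saving is that plain AP only ever runs on a small subset or on the pooled local centers, never on the full similarity matrix.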
Pages: 894-899
Page count: 6