An Improved Affinity Propagation Clustering Algorithm for Large-scale Data Sets

Cited by: 0
Authors
Liu, Xiaonan [1]
Yin, Meijuan [1]
Luo, Junyong [1]
Chen, Wuping [2]
Affiliations
[1] State Key Lab Math Engn & Adv Comp, Zhengzhou, Peoples R China
[2] Sci & Technol Informat Assurance Lab, Beijing, Peoples R China
Keywords
Data clustering; affinity propagation; hierarchical; selection; clustering center
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Affinity Propagation (AP) clustering does not require the number of clusters to be specified in advance and offers good efficiency and accuracy, but it is not suitable for clustering large-scale data. To achieve both low time complexity and good accuracy when applying AP to large-scale data, an improved AP clustering algorithm named hierarchical affinity propagation (HAP) is proposed, which clusters data points by applying the AP algorithm several times to data at different levels. First, the data set to be clustered is divided into several subsets, each small enough to be clustered efficiently by AP. Then, AP is run on each subset to select that subset's local cluster centers. Next, AP is run again on all the local cluster centers to select well-suited global exemplars for the whole data set. Finally, every data point is assigned to a cluster according to its similarity to the global exemplars, which allows large-scale data to be clustered efficiently and accurately. Experimental results on real and simulated data sets show that, compared with the traditional AP and adaptive AP algorithms, HAP greatly reduces clustering time while producing relatively better clustering results.
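The four steps described in the abstract can be sketched roughly as follows. This is only an illustrative, standard-library Python sketch and not the authors' implementation: the abstract does not say how the data are partitioned or how AP is parameterized, so the random partition, the `subset_size`, `damping`, and `iters` values, the median-similarity preference, the negative squared Euclidean similarity, and the function names `affinity_propagation` and `hap` are all assumptions made here.

```python
import random
from statistics import median

def neg_sq_dist(p, q):
    # Similarity assumed here: negative squared Euclidean distance.
    return -sum((a - b) ** 2 for a, b in zip(p, q))

def affinity_propagation(points, damping=0.7, iters=100):
    """Plain AP message passing; returns the indices of the exemplars."""
    n = len(points)
    if n <= 1:
        return list(range(n))
    S = [[neg_sq_dist(p, q) for q in points] for p in points]
    # Assumption: preference = median of the off-diagonal similarities.
    pref = median(S[i][k] for i in range(n) for k in range(n) if i != k)
    for k in range(n):
        S[k][k] = pref
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iters):
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=vals.__getitem__)
            first = vals[best]
            second = max(v for k, v in enumerate(vals) if k != best)
            for k in range(n):
                # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
                new = S[i][k] - (second if k == best else first)
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # positive responsibilities from i' != k
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    # Fall back to the single best candidate if no point qualifies.
    return exemplars or [max(range(n), key=lambda k: R[k][k] + A[k][k])]

def hap(points, subset_size=40, seed=0):
    """Hierarchical AP sketch: returns (global exemplar indices, labels)."""
    order = list(range(len(points)))
    random.Random(seed).shuffle(order)  # assumption: random partition
    # Step 1: divide the data set into subsets small enough for AP.
    subsets = [order[s:s + subset_size]
               for s in range(0, len(order), subset_size)]
    # Step 2: run AP on each subset to collect local cluster centers.
    local = []
    for sub in subsets:
        found = affinity_propagation([points[i] for i in sub])
        local.extend(sub[j] for j in found)
    # Step 3: run AP again on the local centers to get global exemplars.
    found = affinity_propagation([points[i] for i in local])
    global_ex = [local[j] for j in found]
    # Step 4: assign every point to its most similar global exemplar.
    labels = [max(range(len(global_ex)),
                  key=lambda g: neg_sq_dist(p, points[global_ex[g]]))
              for p in points]
    return global_ex, labels

# Small demo on three synthetic 2-D blobs (made-up data, for illustration).
rng = random.Random(1)
centers = [(0.0, 0.0), (8.0, 8.0), (0.0, 8.0)]
data = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
        for cx, cy in centers for _ in range(40)]
exemplars, labels = hap(data, subset_size=40)
print(len(exemplars), "global exemplars for", len(data), "points")
```

In this sketch each AP run only ever sees one subset or the pooled local centers, so no run builds a similarity matrix over the whole data set; the final assignment compares each point only with the global exemplars.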
Pages: 894-899
Number of pages: 6