An Improved Affinity Propagation Clustering Algorithm for Large-scale Data Sets

Cited by: 0

Authors:

Liu, Xiaonan ^{[1]}

Yin, Meijuan ^{[1]}

Luo, Junyong ^{[1]}

Chen, Wuping ^{[2]}

Affiliations:

[1] State Key Lab Math Engn & Adv Comp, Zhengzhou, Peoples R China

[2] Sci & Technol Informat Assurance Lab, Beijing, Peoples R China

Source:

2013 NINTH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION (ICNC) | 2013

Keywords:

Data clustering; affinity propagation; hierarchical; selection; clustering center;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Affinity Propagation (AP) clustering does not require the number of clusters to be set in advance and has advantages in efficiency and accuracy, but it is not suitable for large-scale data clustering. To achieve both low time complexity and good accuracy when applying affinity propagation to large-scale data, an improved AP clustering algorithm named hierarchical affinity propagation (HAP) is proposed, which clusters data points by applying the AP algorithm several times to data at different levels. The data set to be clustered is first divided into several subsets, each of which can be clustered efficiently by the AP algorithm. Then, the AP algorithm is run on each subset to select that subset's local cluster centers. Next, AP clustering is applied again to all the local cluster centers to select well-suited global exemplars for the whole data set. Finally, all the data points are assigned to clusters according to their similarities to the global exemplars, so that large-scale data can be clustered efficiently and accurately. Experimental results on real and simulated data sets show that, compared with the traditional AP and adaptive AP algorithms, the HAP algorithm greatly reduces clustering time while producing relatively better clustering results.
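
The four steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the data set is divided, so a random partition into fixed-size subsets is assumed here; similarity is assumed to be negative squared Euclidean distance with the median similarity as the AP preference; and the names `affinity_propagation`, `hap`, `subset_size`, `damping`, and `iters` are hypothetical.

```python
import random
from statistics import median

def neg_sq_dist(p, q):
    # Assumed similarity: negative squared Euclidean distance.
    return -sum((a - b) ** 2 for a, b in zip(p, q))

def affinity_propagation(points, damping=0.7, iters=100):
    """Plain AP with a fixed iteration count; returns exemplar indices."""
    n = len(points)
    if n <= 1:
        return list(range(n))
    S = [[neg_sq_dist(p, q) for q in points] for p in points]
    # Assumed preference: median of the off-diagonal similarities.
    pref = median(S[i][k] for i in range(n) for k in range(n) if i != k)
    for k in range(n):
        S[k][k] = pref
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iters):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        for i in range(n):
            v = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=v.__getitem__)
            second = max(v[k] for k in range(n) if k != best)
            for k in range(n):
                new = S[i][k] - (second if k == best else v[best])
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k})
        # a(k,k) = sum of positive r(i',k), i' != k
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            pos[k] = R[k][k]
            total = sum(pos)
            for i in range(n):
                new = total - pos[i]
                if i != k:
                    new = min(0.0, new)
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    ex = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    # Fall back to the single strongest candidate if none qualifies.
    return ex or [max(range(n), key=lambda k: R[k][k] + A[k][k])]

def hap(points, subset_size=40, seed=0):
    """Sketch of the hierarchical scheme: local AP, global AP, assignment."""
    order = list(range(len(points)))
    random.Random(seed).shuffle(order)  # assumed: random partition
    local = []
    for s in range(0, len(order), subset_size):
        idx = order[s:s + subset_size]
        # Step 2: AP on each subset selects local cluster centers.
        local.extend(idx[e] for e in affinity_propagation([points[i] for i in idx]))
    # Step 3: AP on the local centers selects global exemplars.
    glob = [local[e] for e in affinity_propagation([points[i] for i in local])]
    # Step 4: assign every point to its most similar global exemplar.
    labels = [max(glob, key=lambda g: neg_sq_dist(p, points[g])) for p in points]
    return glob, labels

# Toy data: three Gaussian blobs of 40 points each.
rng = random.Random(1)
centers = [(0.0, 0.0), (8.0, 8.0), (0.0, 8.0)]
pts = [(cx + rng.gauss(0, 1), cy + rng.gauss(0, 1))
       for cx, cy in centers for _ in range(40)]
exemplars, labels = hap(pts)
print(len(exemplars), "global exemplars")
```

The saving in this sketch comes from the similarity matrices: plain AP on n points builds an n x n matrix, whereas here each AP run only sees one subset or the list of local centers. Each entry of `labels` is the index in `pts` of the global exemplar that point was assigned to.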


Pages: 894-899

Page count: 6
