A boosted clustering algorithm for distributed homogeneous data mining

Cited by: 0
Authors
Li, Chengan [1]
Wu, Tiejun [1]
Affiliations
[1] Zhejiang Univ, Inst Intelligent Syst & Decis Making, Hangzhou 310027, Zhejiang, Peoples R China
Source
WCICA 2006: SIXTH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-12, CONFERENCE PROCEEDINGS | 2006
Keywords
cluster ensembles; distributed clustering; unsupervised learning; boosting strategy; partition schemes
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A new distributed clustering algorithm based on boosting techniques is presented to efficiently integrate multiple partitions constructed over very large, distributed homogeneous databases that cannot be merged at a single location. In the proposed method, individual clustering solutions are first produced from disjoint datasets at each boosting round. The cluster prototypes, rather than the partition matrices, are then transferred to a single site to generate a global cluster prototype, which is broadcast to all distributed sites and used to partition the data at each site. Finally, all the individual solutions are combined into a weighted voting ensemble on each disjoint dataset. Experimental results show that the proposed distributed clustering method achieves clustering accuracy comparable to or slightly better than that of algorithms in which boosting techniques are applied to centralized data. In addition, the communication cost of the proposed algorithm is very small.
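The abstract describes the workflow but not its update rules, so the following is only a minimal sketch of that workflow and not the authors' algorithm. Everything specific in it is an assumption made for illustration: k-means as the base clusterer, resampling weights proportional to each point's distance from its nearest global prototype, vote weights equal to the inverse of a site's mean squared distortion, and all function names.

```python
import numpy as np

def assign(X, centers):
    """Return the nearest-center label and squared distance for each row of X."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), d2.min(axis=1)

def kmeans(X, k, init, iters=20):
    """Plain Lloyd iterations starting from the given initial centers."""
    centers = np.array(init, dtype=float)
    for _ in range(iters):
        labels, _ = assign(X, centers)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster where it is
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def farthest_point_init(X, k):
    """Deterministic init: start at X[0], then repeatedly add the farthest point."""
    centers = [X[0]]
    for _ in range(k - 1):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    return np.array(centers)

def boosted_distributed_clustering(sites, k, rounds=5, seed=0):
    """sites: list of (n_i, d) arrays that never leave their own site."""
    rng = np.random.default_rng(seed)
    weights = [np.full(len(X), 1.0 / len(X)) for X in sites]
    votes = [np.zeros((len(X), k)) for X in sites]
    global_centers = None
    for _ in range(rounds):
        # Step 1: each site clusters a weighted resample of its own data
        # and ships only k prototypes, not its partition matrix.
        local = []
        for X, w in zip(sites, weights):
            sample = X[rng.choice(len(X), size=len(X), p=w)]
            init = (farthest_point_init(sample, k)
                    if global_centers is None else global_centers)
            local.append(kmeans(sample, k, init))
        pooled = np.vstack(local)
        # Step 2: one site merges the pooled prototypes into k global ones.
        # Reusing the previous global centers as the init keeps the
        # cluster indices aligned from round to round.
        init = pooled[:k] if global_centers is None else global_centers
        global_centers = kmeans(pooled, k, init)
        # Step 3: broadcast; each site partitions its data, casts a
        # weighted vote, and re-weights its hard-to-cluster points.
        for s, X in enumerate(sites):
            labels, d2 = assign(X, global_centers)
            alpha = 1.0 / (d2.mean() + 1e-12)  # assumed vote weight
            votes[s][np.arange(len(X)), labels] += alpha
            w = np.sqrt(d2) + 1e-12            # assumed boosting weight
            weights[s] = w / w.sum()
    # Step 4: weighted-vote ensemble, computed locally at each site.
    return global_centers, [v.argmax(axis=1) for v in votes]
```

Calling `boosted_distributed_clustering([X_site1, X_site2], k=2)` returns the last global prototypes and one label array per site. In this sketch only `k` prototype vectors per site cross the network in each round, which is the property behind the small communication cost the abstract reports.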
Pages: 5952-5956
Number of pages: 5