Weighted consensus clustering and its application to Big data

Cited by: 17
Authors
Alguliyev, Rasim M. [1 ]
Aliguliyev, Ramiz M. [1 ]
Sukhostat, Lyudmila, V [1 ]
Affiliation
[1] Azerbaijan Natl Acad Sci, Inst Informat Technol, 9A B Vahabzade St, AZ-1141 Baku, Azerbaijan
Keywords
Weighted consensus clustering; Big data; Utility function; Purity-based utility function; Co-association matrix; ENSEMBLE; ALGORITHM; INDEXES;
DOI
10.1016/j.eswa.2020.113294
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The aim of this study is to develop a weighted consensus clustering approach that assigns weights to single clustering methods using a purity-based utility function. For Big data that does not contain labels, a utility function based on the Davies-Bouldin index is proposed. The Banknote authentication, Phishing, Diabetic, Magic04, Credit card clients, Covertype, Phone accelerometer, and NSL-KDD datasets are used to assess the efficiency of the proposed consensus approach. The approach is evaluated using the Euclidean, Minkowski, squared Euclidean, cosine, and Chebyshev distance metrics, and is compared with single clustering algorithms (DBSCAN, OPTICS, CLARANS, k-means, and shared nearby neighbor clustering). The experimental results show that the proposed approach clusters Big data more effectively than the single clustering methods. The proposed weighted consensus clustering with the squared Euclidean distance metric achieves the highest accuracy. It can be applied to expert systems to help experts make group decisions based on several alternatives. The paper also provides directions for future research on consensus clustering in this area. (C) 2020 Elsevier Ltd. All rights reserved.
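The abstract names the ingredients (purity-based weights and a co-association matrix) but not the exact procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. The toy partitions, the normalization of weights, the 0.5 threshold, and the connected-components consensus step are all assumptions made for illustration.

```python
from collections import Counter


def purity(labels_pred, labels_true):
    """Fraction of points that belong to the majority true class of their cluster."""
    clusters = {}
    for p, t in zip(labels_pred, labels_true):
        clusters.setdefault(p, []).append(t)
    majority = sum(Counter(m).most_common(1)[0][1] for m in clusters.values())
    return majority / len(labels_true)


def weighted_co_association(partitions, weights):
    """Entry (i, j) is the normalized weight of partitions that co-cluster i and j."""
    n = len(partitions[0])
    total = sum(weights)
    matrix = [[0.0] * n for _ in range(n)]
    for labels, w in zip(partitions, weights):
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    matrix[i][j] += w / total
    return matrix


def consensus_labels(matrix, threshold=0.5):
    """Assumed consensus step: link pairs above the threshold, return components."""
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy data (hypothetical): six points, two true classes, three base partitions.
labels_true = [0, 0, 0, 1, 1, 1]
partitions = [
    [0, 0, 0, 1, 1, 1],  # agrees with the true classes
    [0, 0, 1, 1, 1, 1],  # point 2 misplaced
    [0, 1, 0, 1, 1, 1],  # point 1 misplaced
]
weights = [purity(p, labels_true) for p in partitions]
matrix = weighted_co_association(partitions, weights)
consensus = consensus_labels(matrix)
print(weights)
print(consensus)
```

Because the two imperfect partitions misplace different points, no cross-class pair accumulates enough weight to exceed the threshold, and the consensus recovers the two classes. For unlabeled data, the abstract indicates that a Davies-Bouldin-based utility takes the place of purity when computing the weights; that variant is not sketched here.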
Pages: 15
Related papers
50 in total
  • [21] A novel clustering algorithm based on weighted support and its application
    Yang, XR
    Shen, JY
    Liu, Q
    2002 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-4, PROCEEDINGS, 2002, : 95 - 100
  • [22] Application of Big Data Clustering Algorithm in Electrical Engineering Automation
    Zhang, Yongchang
    Zhang, Zhe
    JOURNAL OF APPLIED MATHEMATICS, 2022, 2022
  • [23] Big Data Application and its Impact on Education
    Khan, Shakir
    Alqahtani, Salihah
    INTERNATIONAL JOURNAL OF EMERGING TECHNOLOGIES IN LEARNING, 2020, 15 (17) : 36 - 46
  • [24] Speeding up the large-scale consensus fuzzy clustering for handling Big Data
    Sassi Hidri, Minyar
    Zoghlami, Mohamed Ali
    Ben Ayed, Rahma
    FUZZY SETS AND SYSTEMS, 2018, 348 : 50 - 74
  • [25] Clustering Algorithm and Its Application in Data Mining
    Zou, Hailei
    WIRELESS PERSONAL COMMUNICATIONS, 2020, 110 (01) : 21 - 30
  • [26] Clustering Algorithm and Its Application in Data Mining
    Hailei Zou
    Wireless Personal Communications, 2020, 110 : 21 - 30
  • [27] Secure weighted possibilistic c-means algorithm on cloud for clustering big data
    Zhang, Qingchen
    Yang, Laurence T.
    Castiglione, Arcangelo
    Chen, Zhikui
    Li, Peng
    INFORMATION SCIENCES, 2019, 479 : 515 - 525
  • [28] A Weighted Fuzzy c-Means Clustering Algorithm for Incomplete Big Sensor Data
    Li, Peng
    Chen, Zhikui
    Hu, Yueming
    Leng, Yonglin
    Li, Qiucen
    WIRELESS SENSOR NETWORKS (CWSN 2017), 2018, 812 : 55 - 63
  • [29] CONSENSUS CLUSTERING ON DATA FRAGMENTS
    Sukhanov, S.
    Gupta, V.
    Debes, C.
    Zoubir, A. M.
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 4631 - 4635
  • [30] A SPATIAL OUTLIER DETECTION METHOD FOR BIG DATA BASED ON ADJACENCY WEIGHTED RESIDUALS AND ITS APPLICATION TO COVID-19 DATA
    Baba, Ali Mohammed
    Midi, Habshah
    Abd Rahman, Nur Haizum
    ECONOMIC COMPUTATION AND ECONOMIC CYBERNETICS STUDIES AND RESEARCH, 2021, 55 (03) : 87 - 102