Consensus Big Data Clustering for Bayesian Mixture Models

Cited by: 3
Authors
Karras, Christos [1]
Karras, Aristeidis [1]
Giotopoulos, Konstantinos C. [2]
Avlonitis, Markos [3]
Sioutas, Spyros [1]
Affiliations
[1] Univ Patras, Comp Engn & Informat Dept, Patras 26504, Greece
[2] Univ Patras, Dept Management Sci & Technol, Patras 26334, Greece
[3] Ionian Univ, Dept Informat, Kerkira 49100, Greece
Keywords
stochastic data engineering; cluster analysis; Bayesian mixture modelling; consensus clustering; big-data management and analytics; NUMBER;
DOI
10.3390/a16050245
Chinese Library Classification (CLC) number
TP18 [Theory of artificial intelligence];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the context of big-data analysis, the clustering technique holds significant importance for the effective categorization and organization of extensive datasets. However, pinpointing the ideal number of clusters and handling high-dimensional data can be challenging. To tackle these issues, several strategies have been suggested, such as a consensus clustering ensemble that yields more significant outcomes compared to individual models. Another valuable technique for cluster analysis is Bayesian mixture modelling, which is known for its adaptability in determining cluster numbers. Traditional inference methods such as Markov chain Monte Carlo may be computationally demanding and limit the exploration of the posterior distribution. In this work, we introduce an innovative approach that combines consensus clustering and Bayesian mixture models to improve big-data management and simplify the process of identifying the optimal number of clusters in diverse real-world scenarios. By addressing the aforementioned hurdles and boosting accuracy and efficiency, our method considerably enhances cluster analysis. This fusion of techniques offers a powerful tool for managing and examining large and intricate datasets, with possible applications across various industries.
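The abstract does not say how the individual clusterings are combined, so the following is only a minimal sketch of one common consensus step, not the authors' method. It assumes each model run (for example, a short sampler chain for a Bayesian mixture model) yields one label vector. It then computes the fraction of runs in which each pair of items shares a cluster and groups items whose co-clustering frequency exceeds a threshold. The function names, the threshold of 0.5, and the toy partitions are all hypothetical.

```python
def consensus_matrix(partitions):
    """Fraction of partitions in which each pair of items shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def consensus_clusters(matrix, threshold=0.5):
    """Connected components of the graph linking pairs with frequency > threshold."""
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy label vectors standing in for three independent model runs (made up).
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same grouping as the first run, labels switched
    [0, 0, 1, 2, 2, 2],  # this run splits item 2 into its own cluster
]
matrix = consensus_matrix(partitions)
final_labels = consensus_clusters(matrix)
n_clusters = len(set(final_labels))
```

Because the matrix only records whether two items share a label, it is unaffected by label switching between runs, and the number of clusters falls out of the grouping step rather than being fixed in advance.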
Pages: 18
Related papers
50 in total
  • [41] The full Bayesian significance test for mixture models: results in gene expression clustering
    Lauretto, M. S.
    Pereira, C. A. B.
    Stern, J. M.
    GENETICS AND MOLECULAR RESEARCH, 2008, 7 (03) : 883 - 897
  • [42] On Bayesian Clustering with a Structured Gaussian Mixture
    Yamazaki, Keisuke
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2014, 18 (06) : 1007 - 1012
  • [43] Bayesian estimation and classification with incomplete data using mixture models
    Zhang, JF
    Everson, R
    PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA'04), 2004, : 296 - 303
  • [44] Bayesian Plackett–Luce Mixture Models for Partially Ranked Data
    Cristina Mollica
    Luca Tardella
    Psychometrika, 2017, 82 : 442 - 458
  • [45] Clustering minimal inhibitory concentration data through Bayesian mixture models: An application to detect Mycobacterium tuberculosis resistance mutations
    Grazian, Clara
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2023, 32 (12) : 2423 - 2439
  • [46] Sampling in Dirichlet Process Mixture Models for Clustering Streaming Data
    Dinari, Or
    Freifeld, Oren
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151 : 818 - 835
  • [47] Dirichlet Process Mixture Models with Pairwise Constraints for Data Clustering
    Li C.
    Rana S.
    Phung D.
    Venkatesh S.
    Annals of Data Science, 2016, 3 (2) : 205 - 223
  • [48] Clustering Spatial Data via Mixture Models with Dynamic Weights
    Same, Allou
    TRENDS AND APPLICATIONS IN KNOWLEDGE DISCOVERY AND DATA MINING, 2020, 12237 : 128 - 138
  • [49] ANALYZE VISUAL MODELS FOR ASSESSMENT OF BIG DATA CLUSTERING RESULTS
    Bhuvaneswari, A. P.
    Bindu, C. Shoba
    Sam, R. Praveen
    JOURNAL OF MECHANICS OF CONTINUA AND MATHEMATICAL SCIENCES, 2019, : 276 - 286
  • [50] A Bayesian sparse finite mixture model for clustering data from a heterogeneous population
    Saraiva, Erlandson F.
    Suzuki, Adriano K.
    Milan, Luis A.
    BRAZILIAN JOURNAL OF PROBABILITY AND STATISTICS, 2020, 34 (02) : 323 - 344