Gaussian Mixture Model Clustering with Incomplete Data

Cited by: 34
Authors
Zhang, Yi [1]
Li, Miaomiao [1, 2]
Wang, Siwei [1]
Dai, Sisi [1]
Luo, Lei [1]
Zhu, En [1]
Xu, Huiying [3, 4]
Zhu, Xinzhong [3]
Yao, Chaoyun [5]
Zhou, Haoran [6]
Affiliations
[1] NUDT, Sch Comp, Changsha, Peoples R China
[2] Changsha Univ, Changsha, Hunan, Peoples R China
[3] Zhejiang Normal Univ, Coll Math & Comp Sci, Hangzhou, Zhejiang, Peoples R China
[4] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[5] NUDT, Lab Complex Electromagnet Environm Effects Elect, Changsha, Peoples R China
[6] Chongqing Univ Technol, Chongqing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
GMM; clustering; EM; incomplete data
DOI
10.1145/3408318
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Gaussian mixture model (GMM) clustering has been extensively studied due to its effectiveness and efficiency. Although it performs well in various applications, it cannot effectively handle absent features in the data, which are common in practice. In this article, unlike existing approaches that first impute the missing values and then perform GMM clustering on the imputed data, we propose to integrate imputation and GMM clustering into a unified learning procedure. Specifically, the missing data are filled in using the result of GMM clustering, and the imputed data are then used for GMM clustering. These two steps alternately negotiate with each other to reach an optimum, so that the imputed data best serve GMM clustering. A two-step alternating algorithm with proven convergence is designed to solve the resulting optimization problem. Extensive experiments on eight UCI benchmark datasets validate the effectiveness of the proposed algorithm.
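The alternation described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, whose exact update rules are not given in this record. It is a simplified stand-in that assumes a diagonal-covariance GMM fitted by EM, column-mean initialization of the missing entries, and imputation by the responsibility-weighted component means. The function name `gmm_impute_cluster` and all of its parameters are hypothetical.

```python
import numpy as np

def gmm_impute_cluster(X, k, n_iter=50, seed=0, reg=1e-6):
    """Alternate between GMM fitting and imputation (illustrative sketch only).

    X: (n, d) array with NaN marking missing entries.
    Returns hard cluster labels and the imputed data matrix.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Assumption: start by filling each missing entry with its column mean.
    Xf = np.where(missing, np.nanmean(X, axis=0), X)
    n, d = Xf.shape
    # Initialize component means from random rows, variances from the data.
    means = Xf[rng.choice(n, size=k, replace=False)].copy()
    var = np.tile(Xf.var(axis=0) + reg, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities, computed in the log domain for stability.
        diff = Xf[:, None, :] - means[None, :, :]
        log_p = -0.5 * np.sum(diff**2 / var + np.log(2 * np.pi * var), axis=2)
        log_p += np.log(weights)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixing weights, means, and diagonal variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ Xf) / nk[:, None]
        diff = Xf[:, None, :] - means[None, :, :]
        var = np.einsum("nk,nkd->kd", resp, diff**2) / nk[:, None] + reg
        # Imputation step: refill only the missing entries from the clustering.
        Xf = np.where(missing, resp @ means, Xf)
    return resp.argmax(axis=1), Xf

# Usage: two groups of points with one missing value in each.
X = np.array([[0.0, 0.1], [0.2, np.nan], [0.1, 0.0],
              [10.0, 10.1], [np.nan, 9.9], [10.2, 10.0]])
labels, X_imputed = gmm_impute_cluster(X, k=2)
```

Observed entries are never overwritten; only the positions that were NaN are refilled after each clustering update, which mirrors the idea that imputation and clustering inform each other.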
Pages: 14