Gaussian Mixture Model Clustering with Incomplete Data

Cited by: 34
Authors
Zhang, Yi [1]
Li, Miaomiao [1, 2]
Wang, Siwei [1]
Dai, Sisi [1]
Luo, Lei [1]
Zhu, En [1]
Xu, Huiying [3, 4]
Zhu, Xinzhong [3]
Yao, Chaoyun [5]
Zhou, Haoran [6]
Affiliations
[1] NUDT, Sch Comp, Changsha, Peoples R China
[2] Changsha Univ, Changsha, Hunan, Peoples R China
[3] Zhejiang Normal Univ, Coll Math & Comp Sci, Hangzhou, Zhejiang, Peoples R China
[4] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[5] NUDT, Lab Complex Electromagnet Environm Effects Elect, Changsha, Peoples R China
[6] Chongqing Univ Technol, Chongqing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
GMM; clustering; EM; incomplete data
DOI
10.1145/3408318
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Gaussian mixture model (GMM) clustering has been extensively studied due to its effectiveness and efficiency. Although it performs well in various applications, it cannot effectively handle absent features in the data, which are not uncommon in practice. In this article, unlike existing approaches that first impute the missing values and then perform GMM clustering on the imputed data, we propose to integrate imputation and GMM clustering into a unified learning procedure. Specifically, the missing data are filled using the result of GMM clustering, and the imputed data are then used for GMM clustering. These two steps alternately negotiate with each other to reach an optimum. In this way, the imputed data can best serve GMM clustering. A two-step alternating algorithm with proven convergence is carefully designed to solve the resulting optimization problem. Extensive experiments have been conducted on eight UCI benchmark datasets, and the results validate the effectiveness of the proposed algorithm.
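The alternation described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm or objective; it is a simplified stand-in that assumes diagonal covariances, marks missing entries with None, uses a farthest-point initialisation, and refills each missing entry with the responsibility-weighted component mean. The function name impute_and_cluster and all parameters are hypothetical.

```python
import math


def impute_and_cluster(X, k, n_iter=50):
    """Alternate GMM fitting and imputation (illustrative sketch only).

    X: list of rows; missing entries are None (assumed convention).
    Assumes diagonal covariances, which is a simplification.
    Returns (filled, means, weights, resp).
    """
    n, d = len(X), len(X[0])
    # Initial fill: per-feature mean of the observed values.
    col_mean = []
    for j in range(d):
        obs = [row[j] for row in X if row[j] is not None]
        col_mean.append(sum(obs) / len(obs))
    filled = [[col_mean[j] if row[j] is None else row[j] for j in range(d)]
              for row in X]

    # Farthest-point initialisation of the means (an assumption of this sketch).
    means = [list(filled[0])]
    while len(means) < k:
        def dist_to_nearest(x):
            return min(sum((a - b) ** 2 for a, b in zip(x, m)) for m in means)
        means.append(list(max(filled, key=dist_to_nearest)))
    variances = [[1.0] * d for _ in range(k)]
    weights = [1.0 / k] * k
    resp = []

    for _ in range(n_iter):
        # Clustering step, E part: responsibilities on the currently filled data.
        resp = []
        for x in filled:
            logp = []
            for c in range(k):
                lp = math.log(weights[c])
                for j in range(d):
                    v = variances[c][j]
                    lp -= 0.5 * (math.log(2 * math.pi * v)
                                 + (x[j] - means[c][j]) ** 2 / v)
                logp.append(lp)
            top = max(logp)  # subtract the max for numerical stability
            p = [math.exp(l - top) for l in logp]
            total = sum(p)
            resp.append([pi / total for pi in p])

        # Clustering step, M part: update weights, means, diagonal variances.
        for c in range(k):
            nk = sum(r[c] for r in resp) + 1e-12  # guard against empty clusters
            weights[c] = nk / n
            for j in range(d):
                mu = sum(r[c] * x[j] for r, x in zip(resp, filled)) / nk
                var = sum(r[c] * (x[j] - mu) ** 2
                          for r, x in zip(resp, filled)) / nk
                means[c][j] = mu
                variances[c][j] = max(var, 1e-6)  # variance floor

        # Imputation step: refill only the missing entries from the clustering.
        for i, row in enumerate(X):
            for j in range(d):
                if row[j] is None:
                    filled[i][j] = sum(resp[i][c] * means[c][j]
                                       for c in range(k))
    return filled, means, weights, resp


# Toy data: two well-separated groups with three missing entries.
X = [[0.0, 0.1], [0.2, None], [-0.1, 0.0], [0.1, -0.2],
     [10.0, 10.1], [9.8, None], [10.2, 9.9], [None, 10.0]]
filled, means, weights, resp = impute_and_cluster(X, k=2)
for original, completed in zip(X, filled):
    print(original, "->", [round(v, 3) for v in completed])
```

On this toy data the missing entries start at the global feature mean and are then pulled toward the group their row is assigned to, which is the point of coupling the two steps instead of imputing once up front. A full-covariance model would use the conditional mean given the observed features; the diagonal assumption here reduces that to the component mean.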
Pages: 14
Related papers
50 in total
  • [1] Multivariate data clustering for the Gaussian mixture model
    Kavaliauskas, M
    Rudzkis, R
    INFORMATICA, 2005, 16 (01) : 61 - 74
  • [2] Laplacian Regularized Gaussian Mixture Model for Data Clustering
    He, Xiaofei
    Cai, Deng
    Shao, Yuanlong
    Bao, Hujun
    Han, Jiawei
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2011, 23 (09) : 1406 - 1418
  • [3] Combined Gaussian Mixture Model and Pathfinder Algorithm for Data Clustering
    Huang, Huajuan
    Liao, Zepeng
    Wei, Xiuxi
    Zhou, Yongquan
    ENTROPY, 2023, 25 (06)
  • [4] A latent-class mixture model for incomplete longitudinal Gaussian data
    Beunckens, Caroline
    Molenberghs, Geert
    Verbeke, Geert
    Mallinckrodt, Craig
    BIOMETRICS, 2008, 64 (01) : 96 - 105
  • [5] Fitting Gaussian mixture models on incomplete data
    Zachary R. McCaw
    Hugues Aschard
    Hanna Julienne
    BMC Bioinformatics, 23
  • [6] Fitting Gaussian mixture models on incomplete data
    McCaw, Zachary R.
    Aschard, Hugues
    Julienne, Hanna
    BMC BIOINFORMATICS, 2022, 23 (01)
  • [7] A Deep Fusion Gaussian Mixture Model for Multiview Land Data Clustering
    Li, Peng
    Chen, Zhikui
    Gao, Jing
    Zhang, Jianing
    Jin, Shan
    Zhao, Wenhan
    Xia, Feng
    Wang, Lu
    WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2020, 2020 (2020)
  • [8] Gaussian mixture clustering and imputation of microarray data
    Ouyang, M
    Welsh, WJ
    Georgopoulos, P
    BIOINFORMATICS, 2004, 20 (06) : 917 - 923
  • [9] OPTIMALITY OF SPECTRAL CLUSTERING IN THE GAUSSIAN MIXTURE MODEL
    Loeffler, Matthias
    Zhang, Anderson Y.
    Zhou, Harrison H.
    ANNALS OF STATISTICS, 2021, 49 (05) : 2506 - 2530
  • [10] Transfer Clustering Based on Gaussian Mixture Model
    Wang, Rongrong
    Zhou, Jin
    Liu, Xiangdao
    Han, Shiyuan
    Wang, Lin
    Chen, Yuehui
    2019 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2019), 2019 : 2522 - 2526