Comparison of Criteria for Choosing the Number of Classes in Bayesian Finite Mixture Models

Cited by: 76
Authors
Nasserinejad, Kazem [1, 2]
van Rosmalen, Joost [1]
de Kort, Wim [3, 4]
Lesaffre, Emmanuel [5]
Affiliations
[1] Erasmus MC, Dept Biostat, Rotterdam, Netherlands
[2] Erasmus MC, Dept Hematol, Clin Trial Ctr, Inst Canc, Rotterdam, Netherlands
[3] Sanquin Res, Dept Donor Studies, Amsterdam, Netherlands
[4] Acad Med Ctr, Dept Publ Hlth, Amsterdam, Netherlands
[5] Katholieke Univ Leuven, Biostat L, Leuven, Belgium
Source
PLOS ONE | 2017 / Vol. 12 / Issue 01
Keywords
UNKNOWN NUMBER; BLOOD-DONORS; WHOLE-BLOOD; HEMOGLOBIN LEVELS; IRON-DEFICIENCY; DISTRIBUTIONS;
DOI
10.1371/journal.pone.0168838
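Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code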
07 ; 0710 ; 09 ;
Abstract
Identifying the number of classes in Bayesian finite mixture models is a challenging problem. Several criteria have been proposed, such as adaptations of the deviance information criterion, marginal likelihoods, Bayes factors, and reversible jump MCMC techniques. It was recently shown that in overfitted mixture models, the overfitted latent classes will asymptotically become empty under specific conditions for the prior of the class proportions. This result may be used to construct a criterion for finding the true number of latent classes, based on the removal of latent classes that have negligible proportions. Unlike some alternative criteria, this criterion can easily be implemented in complex statistical models such as latent class mixed-effects models and multivariate mixture models using standard Bayesian software. We performed an extensive simulation study to develop practical guidelines to determine the appropriate number of latent classes based on the posterior distribution of the class proportions, and to compare this criterion with alternative criteria. The performance of the proposed criterion is illustrated using a data set of repeatedly measured hemoglobin values of blood donors.
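The idea of discarding latent classes with negligible proportions can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `estimate_num_classes`, the cutoff of 0.05, and the rule of taking the most frequent count across draws are all assumptions made for illustration. The input is assumed to be an array of posterior draws of the class proportions from a deliberately overfitted mixture model, one row per MCMC draw.

```python
import numpy as np


def estimate_num_classes(proportion_draws, threshold=0.05):
    """Count non-negligible classes from posterior draws of class proportions.

    proportion_draws: array of shape (n_draws, n_classes); each row is one
        posterior draw of the class proportions (assumed to sum to 1).
    threshold: hypothetical cutoff below which a class is treated as empty.
    """
    draws = np.asarray(proportion_draws, dtype=float)
    if draws.ndim != 2:
        raise ValueError("expected a 2-D array of shape (n_draws, n_classes)")

    # Count, within each draw, how many classes exceed the cutoff.
    # Counting per draw does not depend on class labels, so label
    # switching between draws does not affect the result.
    counts = (draws > threshold).sum(axis=1)

    # Report the most frequent count across draws (posterior mode).
    values, freq = np.unique(counts, return_counts=True)
    return int(values[np.argmax(freq)])


# Illustrative, made-up draws from a 3-class model in which one class
# is nearly empty in every draw (labels switch between draws).
example_draws = np.array([
    [0.60, 0.38, 0.02],
    [0.38, 0.60, 0.02],
    [0.50, 0.49, 0.01],
    [0.02, 0.58, 0.40],
])
print(estimate_num_classes(example_draws))
```

The cutoff matters in practice: a value that is too small keeps spurious classes, while one that is too large can drop a genuinely small class, which is the kind of trade-off the simulation study described above is meant to inform.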
Pages: 23