Bayesian mixture models (in)consistency for the number of clusters

Times cited: 1
Authors
Alamichel, Louise [1]
Bystrova, Daria [1,2]
Arbel, Julyan [1]
Kon Kam King, Guillaume [3]
Affiliations
[1] Univ Grenoble Alpes, Inria, Grenoble INP, LJK, CNRS, Grenoble, France
[2] Univ Savoie Mont Blanc, CNRS, Lab Ecol Alpine, Univ Grenoble Alpes, Grenoble, France
[3] Univ Paris Saclay, INRAE, MaIAGE, Jouy En Josas, France
Keywords
clustering; finite mixtures; finite-dimensional BNP representations; Gibbs-type process; GIBBS-TYPE PRIORS; PITMAN-YOR; NONPARAMETRIC-INFERENCE; DIRICHLET MIXTURES; DENSITY-ESTIMATION; CONVERGENCE-RATES; FINITE; CONSISTENCY
DOI
10.1111/sjos.12739
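Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract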
Bayesian nonparametric mixture models are common for modeling complex data. While these models are well-suited for density estimation, recent results proved posterior inconsistency of the number of clusters when the true number of components is finite, for the Dirichlet process and Pitman-Yor process mixture models. We extend these results to additional Bayesian nonparametric priors such as Gibbs-type processes and finite-dimensional representations thereof. The latter include the Dirichlet multinomial process, the recently proposed Pitman-Yor, and normalized generalized gamma multinomial processes. We show that mixture models based on these processes are also inconsistent in the number of clusters and discuss possible solutions. Notably, we show that a postprocessing algorithm introduced for the Dirichlet process can be extended to more general models and provides a consistent method to estimate the number of components.
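As background for the abstract, the following is a minimal sketch of how the number of clusters behaves under a Dirichlet process prior, using the standard Chinese restaurant process fact that observation i+1 opens a new cluster with probability alpha / (alpha + i). It illustrates the prior only. It does not reproduce the posterior inconsistency results or the postprocessing algorithm of the paper, and the function names and the concentration value alpha = 1.0 are assumptions chosen for illustration.

```python
import random


def expected_num_clusters(n, alpha):
    """Prior expected number of clusters among n observations under a
    Dirichlet process with concentration alpha (hypothetical helper)."""
    # Observation i+1 (i = 0, ..., n-1) starts a new cluster with
    # probability alpha / (alpha + i); the expectation sums these.
    return sum(alpha / (alpha + i) for i in range(n))


def sample_num_clusters(n, alpha, rng):
    """Draw one realisation of the number of clusters under the same prior."""
    # Only the count of new-cluster events is needed, so cluster sizes
    # are not tracked. This shortcut is valid for the Dirichlet process,
    # not for Pitman-Yor or general Gibbs-type priors.
    return sum(rng.random() < alpha / (alpha + i) for i in range(n))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, assumption for reproducibility
    alpha = 1.0             # illustrative concentration parameter
    for n in (10, 100, 1000, 10000):
        draws = [sample_num_clusters(n, alpha, rng) for _ in range(200)]
        mean_k = sum(draws) / len(draws)
        print(n, round(expected_num_clusters(n, alpha), 2), round(mean_k, 2))
```

The expected count keeps growing with n, roughly like alpha times log n, so the prior never settles on a fixed number of clusters as the sample size increases.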
Pages: 1619-1660
Page count: 42
Related papers
50 in total
  • [21] Consensus clustering for Bayesian mixture models
    Stephen Coleman
    Paul D. W. Kirk
    Chris Wallace
    BMC Bioinformatics, 23
  • [22] Distributed data mining on clusters with Bayesian mixture modeling
    Viswanathan, M
    Yang, YK
    Whangbo, TK
    FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, PT 1, PROCEEDINGS, 2005, 3613 : 1207 - 1216
  • [23] Consistency of Bayesian linear model selection with a growing number of parameters
    Shang, Zuofeng
    Clayton, Murray K.
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2011, 141 (11) : 3463 - 3474
  • [24] Species sampling models: consistency for the number of species
    Bissiri, P. G.
    Ongaro, A.
    Walker, S. G.
    BIOMETRIKA, 2013, 100 (03) : 771 - 777
  • [25] Mixture Models With a Prior on the Number of Components
    Miller, Jeffrey W.
    Harrison, Matthew T.
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2018, 113 (521) : 340 - 356
  • [26] An entropy criterion for assessing the number of clusters in a mixture model
    Celeux, G
    Soromenho, G
    JOURNAL OF CLASSIFICATION, 1996, 13 (02) : 195 - 212
  • [27] On the consistency of MLE in finite mixture models of exponential families
    Atienza, N.
    Garcia-Heras, J.
    Munoz-Pichardo, J. M.
    Villa-Caro, R.
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2007, 137 (02) : 496 - 505
  • [28] On Posterior Consistency of Bayesian Factor Models in High Dimensions
    Ma, Yucong
    Liu, Jun S.
    BAYESIAN ANALYSIS, 2022, 17 (03) : 901 - 929
  • [29] Approximate Bayesian inference for mixture cure models
    E. Lázaro
    C. Armero
    V. Gómez-Rubio
    TEST, 2020, 29 : 750 - 767
  • [30] Approximate Bayesian inference for mixture cure models
    Lazaro, E.
    Armero, C.
    Gomez-Rubio, V
    TEST, 2020, 29 (03) : 750 - 767