An evaluation of the bootstrap for model validation in mixture models

Cited by: 14

Authors:

Jaki, Thomas [1]

Su, Ting-Li [2]

Kim, Minjung [3]

Van Horn, M. Lee [4]

Affiliations:

[1] Univ Lancaster, Dept Math & Stat, Lancaster LA1 4YF, England

[2] Univ Manchester, Div Dent, Manchester, Lancs, England

[3] Univ Alabama, Dept Psychol, Box 870348, Tuscaloosa, AL 35487 USA

[4] Univ New Mexico, Coll Educ, Albuquerque, NM 87131 USA

Source:

COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION | 2018, Vol. 47, No. 04

Keywords:

Finite mixture models; Leave-k-out cross-validation; Model validation; Nonparametric Bootstrap; Regression mixture models; FINITE MIXTURES; BAYESIAN-INFERENCE; COMPONENTS; NUMBER;

DOI:

10.1080/03610918.2017.1303726

Chinese Library Classification (CLC) number:

O21 [Probability theory and mathematical statistics]; C8 [Statistics];

Subject classification codes:

020208 ; 070103 ; 0714 ;

Abstract:

Bootstrapping has been used as a diagnostic tool for validating model results for a wide array of statistical models. Here we evaluate the use of the non-parametric bootstrap for model validation in mixture models. We show that the bootstrap is problematic for validating the results of class enumeration and demonstrating the stability of parameter estimates in both finite mixture and regression mixture models. In only 44% of simulations did bootstrapping detect the correct number of classes in at least 90% of the bootstrap samples for a finite mixture model without any model violations. For regression mixture models and cases with violated model assumptions, the performance was even worse. Consequently, we cannot recommend the non-parametric bootstrap for validating mixture models.

The cause of the problem is that, when resampling is used, influential individual observations have a high likelihood of being sampled many times. The presence of multiple replications of even moderately extreme observations is shown to lead to additional latent classes being extracted. To verify that these replications cause the problems, we show that leave-k-out cross-validation, in which sub-samples are taken without replacement, does not suffer from the same problem.
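The contrast the abstract draws between the two resampling schemes can be illustrated with a minimal sketch. It shows only the resampling step, not the mixture-model fitting, and it is not the authors' simulation code: the function names, the sample size `n`, the number of held-out observations `k`, and the number of replications `reps` are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def bootstrap_indices(n, rng):
    """Nonparametric bootstrap: draw n indices WITH replacement."""
    return [rng.randrange(n) for _ in range(n)]


def leave_k_out_indices(n, k, rng):
    """Leave-k-out sub-sample: keep n - k distinct indices (WITHOUT replacement)."""
    return rng.sample(range(n), n - k)


def max_multiplicity(indices):
    """Largest number of times any single observation appears in the sample."""
    return max(Counter(indices).values())


# Hypothetical settings, chosen only for illustration.
rng = random.Random(0)
n, k, reps = 500, 50, 200

boot_max = [max_multiplicity(bootstrap_indices(n, rng)) for _ in range(reps)]
lko_max = [max_multiplicity(leave_k_out_indices(n, k, rng)) for _ in range(reps)]

print("bootstrap, most copies of one observation:", max(boot_max))
print("leave-k-out, most copies of one observation:", max(lko_max))
```

In a bootstrap sample some observations appear several times, so a moderately extreme point can be replicated into a small cluster of its own. A leave-k-out sub-sample contains each retained observation exactly once, which is the property the abstract credits for avoiding the extra latent classes.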


Pages: 1028-1038

Number of pages: 11
