Model selection with bootstrap validation

Cited by: 1
Authors
Savvides, Rafael [1]
Makela, Jarmo [1]
Puolamaki, Kai [1, 2]
Affiliations
[1] Univ Helsinki, Dept Comp Sci, Helsinki, Finland
[2] Univ Helsinki, Inst Atmospher & Earth Syst Res, Helsinki, Finland
Funding
Academy of Finland
Keywords
bootstrap; model selection; cross-validation
DOI
10.1002/sam.11606
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Model selection is one of the most central tasks in supervised learning. Validation set methods are the standard way to accomplish this task: models are trained on training data, and the model with the smallest loss on the validation data is selected. However, it is generally not obvious how much validation data is required to make a reliable selection, which is essential when labeled data are scarce or expensive. We propose a bootstrap-based algorithm, bootstrap validation (BSV), that uses the bootstrap to adjust the validation set size and to find the best-performing model within a tolerance parameter specified by the user. We find that BSV works well in practice and can be used as a drop-in replacement for validation set methods or k-fold cross-validation. The main advantage of BSV is that less validation data is typically needed, so more data can be used to train the model, resulting in better approximations and efficient use of validation data.
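The idea summarized in the abstract, using the bootstrap to judge whether the available validation data already supports a selection within a user-specified tolerance, can be illustrated with a small sketch. This is not the authors' BSV algorithm, whose details are not given in this record; it is a simplified illustration under stated assumptions. The function names `bootstrap_selection` and `smallest_sufficient_size`, the parameters `eps`, `confidence`, `start`, and `step`, and the stopping rule are all hypothetical.

```python
import numpy as np


def bootstrap_selection(losses, eps, n_boot=1000, seed=0):
    """Pick the model with the smallest mean validation loss and estimate,
    by bootstrap resampling of the validation examples, how often that
    model is within `eps` of the best model in a resample.

    losses: array of shape (n_val, n_models) with per-example losses.
    NOTE: hypothetical helper for illustration, not the published BSV.
    """
    rng = np.random.default_rng(seed)
    n_val = losses.shape[0]
    chosen = int(np.argmin(losses.mean(axis=0)))
    # Each row of idx is one bootstrap resample of validation indices.
    idx = rng.integers(0, n_val, size=(n_boot, n_val))
    boot_means = losses[idx].mean(axis=1)  # shape (n_boot, n_models)
    regret = boot_means[:, chosen] - boot_means.min(axis=1)
    return chosen, float(np.mean(regret <= eps))


def smallest_sufficient_size(losses, eps, confidence=0.95,
                             start=20, step=20, **kw):
    """Grow the validation set until the bootstrap estimate reaches
    `confidence` (an assumed stopping rule), or all data is used."""
    n_total = losses.shape[0]
    for n in range(start, n_total + 1, step):
        chosen, p = bootstrap_selection(losses[:n], eps, **kw)
        if p >= confidence:
            return n, chosen, p
    chosen, p = bootstrap_selection(losses, eps, **kw)
    return n_total, chosen, p


# Synthetic example: three models whose expected losses differ slightly.
rng = np.random.default_rng(1)
demo_losses = rng.normal(loc=[1.0, 1.1, 1.5], scale=0.5, size=(400, 3))
n_used, model, prob = smallest_sufficient_size(demo_losses, eps=0.05)
print(f"validation examples used: {n_used}, model: {model}, estimate: {prob:.2f}")
```

In a sketch like this, any validation examples not needed to reach the target confidence could be returned to the training set, which is the trade-off the abstract describes.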
Pages: 162-186
Page count: 25