Cross-Validation With Confidence

Cited by: 47
Authors
Lei, Jing [1]
Affiliations
[1] Carnegie Mellon Univ, Dept Stat & Data Sci, 5000 Forbes Ave, Pittsburgh, PA 15213 USA
Keywords
Cross-validation; Hypothesis testing; Model selection; Overfitting; Tuning parameter selection; LASSO
DOI
10.1080/01621459.2019.1672556
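Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract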
Cross-validation is one of the most popular model and tuning parameter selection methods in statistics and machine learning. Despite its wide applicability, traditional cross-validation methods tend to overfit because they ignore the uncertainty in the testing sample. We develop a novel, statistically principled inference tool based on cross-validation that takes into account the uncertainty in the testing sample. This method outputs a set of highly competitive candidate models containing the optimal one with guaranteed probability. As a consequence, our method can achieve consistent variable selection in a classical linear regression setting, for which existing cross-validation methods require unconventional split ratios. When used for tuning parameter selection, the method can provide a different trade-off between prediction accuracy and model interpretability than existing variants of cross-validation. We demonstrate the performance of the proposed method in several simulated and real data examples. Supplemental materials for this article can be found online.
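The idea in the abstract of returning a set of competitive candidates, rather than the single cross-validation minimizer, can be illustrated with a minimal Python sketch. This is not the procedure of the paper: it is a simplified stand-in that uses one-sided paired z-tests with a Bonferroni correction on held-out losses, ignores dependence between folds, and carries no coverage guarantee. The function names (`cv_losses`, `confidence_set`, `fit_mean`, `fit_line`) and the simulated data are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def fit_mean(xs, ys):
    """Candidate 0 (hypothetical): predict the training mean everywhere."""
    mu = mean(ys)
    return lambda x0: mu


def fit_line(xs, ys):
    """Candidate 1 (hypothetical): simple least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
    return lambda x0: my + slope * (x0 - mx)


def cv_losses(x, y, fitters, n_folds=5, seed=0):
    """Per-observation held-out squared errors, one column per candidate."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    losses = [[0.0] * len(fitters) for _ in x]
    for k in range(n_folds):
        test = idx[k::n_folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        xs, ys = [x[i] for i in train], [y[i] for i in train]
        for m, fit in enumerate(fitters):
            predict = fit(xs, ys)
            for i in test:
                losses[i][m] = (y[i] - predict(x[i])) ** 2
    return losses


def confidence_set(losses, alpha=0.05):
    """Keep candidate m unless some rival has significantly lower held-out loss.

    Simplified stand-in (assumption): paired z-tests with Bonferroni correction.
    """
    n, n_models = len(losses), len(losses[0])
    z_crit = NormalDist().inv_cdf(1 - alpha / max(n_models - 1, 1))
    kept = []
    for m in range(n_models):
        rejected = False
        for j in range(n_models):
            if j == m:
                continue
            diffs = [row[m] - row[j] for row in losses]  # > 0 means m is worse
            sd = stdev(diffs)
            if sd > 0:
                z = mean(diffs) * math.sqrt(n) / sd
            else:
                z = math.inf if mean(diffs) > 0 else 0.0
            if z > z_crit:
                rejected = True
                break
        if not rejected:
            kept.append(m)
    return kept


# Simulated data (assumption): y depends linearly on x plus Gaussian noise.
rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(200)]
y = [2 * a + rng.gauss(0, 0.5) for a in x]

losses = cv_losses(x, y, [fit_mean, fit_line])
kept = confidence_set(losses)
print("candidates retained:", kept)
```

When several candidates have statistically indistinguishable held-out losses, a set like `kept` retains all of them, and the analyst can then pick the simplest retained model rather than the raw cross-validation minimizer.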
Pages: 1978-1997
Number of pages: 20