Cross-Validation With Confidence

Cited by: 47
Authors
Lei, Jing [1]
Affiliation
[1] Carnegie Mellon Univ, Dept Stat & Data Sci, 5000 Forbes Ave, Pittsburgh, PA 15213 USA
Keywords
Cross-validation; Hypothesis testing; Model selection; Overfitting; Tuning parameter selection; Lasso
DOI
10.1080/01621459.2019.1672556
Chinese Library Classification (CLC)
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Cross-validation is one of the most popular model and tuning parameter selection methods in statistics and machine learning. Despite its wide applicability, traditional cross-validation methods tend to overfit, due to the ignorance of the uncertainty in the testing sample. We develop a novel statistically principled inference tool based on cross-validation that takes into account the uncertainty in the testing sample. This method outputs a set of highly competitive candidate models containing the optimal one with guaranteed probability. As a consequence, our method can achieve consistent variable selection in a classical linear regression setting, for which existing cross-validation methods require unconventional split ratios. When used for tuning parameter selection, the method can provide an alternative trade-off between prediction accuracy and model interpretability than existing variants of cross-validation. We demonstrate the performance of the proposed method in several simulated and real data examples. Supplemental materials for this article can be found online.
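To make the idea of "a set of candidate models" concrete, the following is a minimal sketch and not the paper's procedure. It assumes a hypothetical matrix `losses[m][i]` of held-out losses (candidate m, test point i), and it swaps in a simple paired z-type test with a Bonferroni-corrected normal cutoff to decide which candidates cannot be ruled out as best. The function name `confidence_set`, the cutoff rule, and the simulated data are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def confidence_set(losses, alpha=0.05):
    """Return indices of candidates not rejected as worse than a competitor.

    losses[m][i] is the held-out loss of candidate m on test point i.
    Illustrative simplification: paired tests with a Bonferroni cutoff.
    """
    n_models = len(losses)
    n = len(losses[0])
    # One-sided normal critical value, Bonferroni-adjusted over competitors.
    crit = NormalDist().inv_cdf(1 - alpha / max(n_models - 1, 1))
    kept = []
    for m in range(n_models):
        worst = -math.inf  # largest standardized excess loss of m over any rival
        for j in range(n_models):
            if j == m:
                continue
            diff = [losses[m][i] - losses[j][i] for i in range(n)]
            avg, sd = mean(diff), stdev(diff)
            if sd == 0:
                t = 0.0 if avg == 0 else math.copysign(math.inf, avg)
            else:
                t = avg / (sd / math.sqrt(n))
            worst = max(worst, t)
        # Keep m unless some rival beats it by a statistically clear margin.
        if worst <= crit:
            kept.append(m)
    return kept


# Simulated example: candidates 0 and 1 are nearly tied, candidate 2 is clearly worse.
random.seed(0)
base = [random.gauss(0, 1) ** 2 for _ in range(200)]
losses = [
    base,
    [b + random.gauss(0, 0.1) for b in base],
    [b + 1.0 + random.gauss(0, 0.1) for b in base],
]
kept = confidence_set(losses)
```

In this toy setup the clearly worse candidate is dropped, while near-ties are typically retained together rather than forced into a single winner, which is the behaviour the abstract describes at a high level.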
Pages: 1978-1997
Page count: 20
Related papers
50 in total
  • [31] CROSS-VALIDATION OF GORDON SIV
    MORRIS, BB
    PERCEPTUAL AND MOTOR SKILLS, 1968, 27 (01) : 44 - &
  • [32] Far Casting Cross-Validation
    Carmack, Patrick S.
    Schucany, William R.
    Spence, Jeffrey S.
    Gunst, Richard F.
    Lin, Qihua
    Haley, Robert W.
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2009, 18 (04) : 879 - 893
  • [33] CROSS-VALIDATION OF PRIVACY FACTORS
    PEDERSEN, DM
    PERCEPTUAL AND MOTOR SKILLS, 1982, 55 (01) : 57 - 58
  • [34] Median cross-validation criterion
    YANG Ying (Department of Applied Mathematics)
    Chinese Science Bulletin, 1997, (23) : 1956 - 1959
  • [35] On cross-validation of Bayesian models
    Alqallaf, F
    Gustafson, P
    CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2001, 29 (02) : 333 - 340
  • [36] Graph Fission and Cross-Validation
    Leiner, James
    Ramdas, Aaditya
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [37] CROSS-VALIDATION IN STEPWISE REGRESSION
    SALAHUDDIN
    HAWKES, AG
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 1991, 20 (04) : 1163 - 1182
  • [38] CROSS-VALIDATION IN SURVIVAL ANALYSIS
    VERWEIJ, PJM
    VANHOUWELINGEN, HC
    STATISTICS IN MEDICINE, 1993, 12 (24) : 2305 - 2314
  • [39] CROSS-VALIDATION OF AN OBJECTIVE RORSCHACH
    BOYD, RW
    JOURNAL OF CLINICAL PSYCHOLOGY, 1963, 19 (03) : 322 - 323
  • [40] A CROSS-VALIDATION OF AN OBJECTIVE RORSCHACH
    VERMA, SK
    KUMAR, P
    INDIAN JOURNAL OF PSYCHOLOGY, 1966, 41 : 45 - &