Optimal ratio for data splitting

Cited by: 300
Authors
Joseph, V. Roshan [1]
Affiliations
[1] Georgia Inst Technol, H Milton Stewart Sch Ind & Syst Engn, Atlanta, GA 30332 USA
Funding
National Science Foundation (USA);
Keywords
testing; training; validation; CALIBRATION; VALIDATION; MODELS;
DOI
10.1002/sam.11583
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
It is common to split a dataset into training and testing sets before fitting a statistical or machine learning model. However, there is no clear guidance on how much data should be used for training and how much for testing. In this article, we show that the optimal training/testing splitting ratio is √p : 1, where p is the number of parameters in a linear regression model that explains the data well.
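As a rough illustration of the ratio stated in the abstract, a training/testing ratio of √p : 1 corresponds to a training fraction of √p / (√p + 1). The sketch below turns that into sample counts. The function name `split_sizes`, the rounding rule, and the clamp that keeps both sets non-empty are assumptions made for this example and are not taken from the article; choosing p (the number of parameters in a linear regression model that explains the data well) is left to the user.

```python
import math


def split_sizes(n_samples: int, n_params: int) -> tuple[int, int]:
    """Return (n_train, n_test) for a sqrt(p) : 1 training/testing ratio.

    Hypothetical helper: rounding to the nearest integer and clamping so
    that both sets are non-empty are choices made for this sketch only.
    """
    if n_samples < 2:
        raise ValueError("need at least 2 samples to form two sets")
    if n_params < 1:
        raise ValueError("n_params must be at least 1")

    root_p = math.sqrt(n_params)
    train_fraction = root_p / (root_p + 1.0)  # sqrt(p) : 1 as a fraction

    n_train = round(n_samples * train_fraction)
    n_train = min(max(n_train, 1), n_samples - 1)  # keep both sets non-empty
    return n_train, n_samples - n_train


if __name__ == "__main__":
    # p = 9 gives a 3 : 1 ratio, i.e. 75% training and 25% testing.
    print(split_sizes(1000, 9))   # (750, 250)
    # p = 1 gives a 1 : 1 ratio, i.e. an even split.
    print(split_sizes(100, 1))    # (50, 50)
```

Under this reading, the training share grows with p: p = 16 gives 4 : 1 (80% training), while p = 1 reduces to an even split.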
Pages: 531-538
Page count: 8
Related papers
50 in total
  • [31] The Splitting Game: Value and Optimal Strategies
    Oliu-Barton, Miquel
    DYNAMIC GAMES AND APPLICATIONS, 2018, 8 (01) : 157 - 179
  • [32] Intelligent data splitting for volume data
    Shen, Hong
    Bartsch, Ernst
    MEDICAL IMAGING 2006: IMAGE PROCESSING, PTS 1-3, 2006, 6144
  • [33] Optimal Frame-Splitting Ratio Achieving Lowest BER over AWGN and Rayleigh Fading Channels with Distributed Antennas
    Yasutake, Makoto
    Cheng, Jun
    Sun, Chen
    Watanabe, Yoichiro
    2008 INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY AND ITS APPLICATIONS, VOLS 1-3, 2008, : 1326 - +
  • [34] Global F-splitting ratio of modules
    De Stefani, Alessandro
    Polstra, Thomas
    Yao, Yongwei
    JOURNAL OF ALGEBRA, 2022, 610 : 773 - 792
  • [35] Beamsplitter has widely adjustable splitting ratio
    Wallace, John
    LASER FOCUS WORLD, 2011, 47 (11) : 16 - +
  • [36] Optimal pilot-to-data power ratio for diversity combining with imperfect channel estimation
    Peng, Y
    Cui, SS
    You, R
    IEEE COMMUNICATIONS LETTERS, 2006, 10 (02) : 97 - 99
  • [37] MISE-OPTIMAL GROUPING OF POINT-PROCESS DATA WITH A CONSTANT DISPERSION RATIO
    Chen, Huifen
    Schmeiser, Bruce
    2018 WINTER SIMULATION CONFERENCE (WSC), 2018, : 1563 - 1574
  • [38] Service Ratio-Optimal, Content Coherence-Aware Data Push Systems
    Liaskos, Christos
    Tsioliaridou, Ageliki
    ACM TRANSACTIONS ON MANAGEMENT INFORMATION SYSTEMS, 2016, 6 (04)
  • [39] Optimal Flow Splitting for Multi-Path Multi-Interface Wireless Data Streaming Networks
    Dilawari, A.
    Tahir, Muhammad
    2013 IEEE 24TH INTERNATIONAL SYMPOSIUM ON PERSONAL, INDOOR, AND MOBILE RADIO COMMUNICATIONS (PIMRC), 2013, : 1878 - 1882
  • [40] Optimal Power Allocation and Power Splitting Ratio Assignments for SWIPT-Enabled Orthogonal Multiple Access with Distributed Antenna Systems
    Kim, Dongjae
    Choi, Minseok
    Seo, Dong-Wook
    ELECTRONICS, 2023, 12 (09)