Cross validation in LASSO and its acceleration

Cited by: 50
Authors
Obuchi, Tomoyuki [1]
Kabashima, Yoshiyuki [1]
Affiliations
[1] Tokyo Inst Technol, Dept Math & Comp Sci, Yokohama, Kanagawa 2268502, Japan
Keywords
message-passing algorithms; learning theory; statistical inference; STATISTICAL-MECHANICS; SELECTION; SHRINKAGE; MAXIMUM; IMPROVE;
DOI
10.1088/1742-5468/2016/05/053304
Chinese Library Classification number
O3 [Mechanics]
Discipline classification code
08; 0801
Abstract
We investigate leave-one-out cross validation (CV) as a means of determining the weight of the penalty term in the least absolute shrinkage and selection operator (LASSO). First, on the basis of the message passing algorithm and a perturbative argument that assumes the number of observations is sufficiently large, we provide simple formulas for approximately assessing two types of CV errors, which significantly reduce the computational cost. These formulas also connect the CV errors in a simple way to the residual sums of squares between the reconstructed and the given measurements. Second, on the basis of this finding, we use the replica method to analytically evaluate the CV errors when the design matrix is a simple random matrix in the large-size limit. Finally, these results are compared with numerical simulations on finite-size systems and are confirmed to be correct. We also apply the simple formula for the first type of CV error to a real supernova dataset.
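For orientation, the brute-force procedure that such approximate formulas are meant to avoid can be sketched as follows. This is a minimal illustration, not the paper's method: it refits the LASSO once per held-out observation for each candidate penalty weight. The objective scaling `0.5*||y - A x||^2 + lam*||x||_1`, the coordinate-descent solver, the synthetic data, and all names (`soft_threshold`, `lasso_cd`, `loo_cv_error`, `lambdas`) are assumptions made for this sketch.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(A, y, lam, n_iter=100):
    """Approximately minimize 0.5*||y - A x||^2 + lam*||x||_1
    by cyclic coordinate descent (assumed objective scaling)."""
    n = A.shape[1]
    x = np.zeros(n)
    col_sq = (A ** 2).sum(axis=0)
    r = y.copy()  # running residual y - A x
    for _ in range(n_iter):
        for j in range(n):
            if col_sq[j] == 0.0:
                continue
            r += A[:, j] * x[j]  # remove coordinate j from the fit
            x[j] = soft_threshold(A[:, j] @ r, lam) / col_sq[j]
            r -= A[:, j] * x[j]  # add the updated coordinate back
    return x

def loo_cv_error(A, y, lam):
    """Naive leave-one-out CV error: refit without row i, then
    score the squared prediction error on the held-out row i."""
    m = A.shape[0]
    errs = np.empty(m)
    for i in range(m):
        keep = np.arange(m) != i
        x = lasso_cd(A[keep], y[keep], lam)
        errs[i] = 0.5 * (y[i] - A[i] @ x) ** 2
    return errs.mean()

# Hypothetical synthetic problem: sparse signal, Gaussian design, small noise.
rng = np.random.default_rng(0)
M, N, K = 30, 20, 3
A = rng.standard_normal((M, N)) / np.sqrt(N)
x_true = np.zeros(N)
x_true[:K] = rng.standard_normal(K)
y = A @ x_true + 0.05 * rng.standard_normal(M)

lambdas = np.array([0.01, 0.03, 0.1, 0.3])
cv = np.array([loo_cv_error(A, y, lam) for lam in lambdas])
best_lambda = lambdas[np.argmin(cv)]
```

Each candidate penalty weight costs one LASSO fit per observation here, which is the expense that a formula based on a single fit on the full data would remove.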
Pages: 36