A global two-stage algorithm for non-convex penalized high-dimensional linear regression problems

Cited: 1
Authors
Li, Peili [1 ]
Liu, Min [2 ]
Yu, Zhou [1 ]
Affiliations
[1] East China Normal Univ, KLATASDS MOE, Sch Stat, Shanghai 200062, Peoples R China
[2] Wuhan Univ, Sch Math & Stat, Wuhan 430072, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
High-dimensional linear regression; Global convergence; Two-stage algorithm; Primal dual active set with continuation algorithm; Difference of convex functions; COORDINATE DESCENT ALGORITHMS; ACTIVE SET ALGORITHM; VARIABLE SELECTION; LIKELIHOOD;
DOI
10.1007/s00180-022-01249-w
CLC Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline Codes
020208; 070103; 0714;
Abstract
Owing to their asymptotic oracle property, non-convex penalties, represented by the minimax concave penalty (MCP) and the smoothly clipped absolute deviation (SCAD) penalty, have attracted much attention in high-dimensional data analysis and have been widely used in signal processing, image restoration, matrix estimation, etc. However, because of their non-convex and non-smooth characteristics, these penalties are computationally challenging. Almost all existing algorithms converge only locally, so the proper selection of initial values is crucial; in practice, they are typically combined with a warm-starting technique to meet the rigid requirement that the initial value be sufficiently close to the optimal solution of the corresponding problem. In this paper, based on the DC (difference of convex functions) property of the MCP and SCAD penalties, we design a global two-stage algorithm for high-dimensional penalized least squares linear regression problems. A key idea for making the proposed algorithm efficient is to use the primal dual active set with continuation (PDASC) method to solve the corresponding sub-problems. Theoretically, we not only prove the global convergence of the proposed algorithm but also verify that the generated iterative sequence converges to a d-stationary point. In terms of computational performance, extensive simulation studies and real-data experiments show that the proposed algorithm is superior to the recent SSN method and the classic coordinate descent (CD) algorithm for solving non-convex penalized high-dimensional linear regression problems.
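The DC property mentioned in the abstract can be illustrated concretely. The sketch below (not the paper's code; parameter names `lam` and `gamma` are our notation) writes the MCP penalty as an L1 term minus a convex function h, which is exactly the kind of difference-of-convex split the two-stage algorithm exploits:

```python
import numpy as np

def mcp(t, lam=1.0, gamma=3.0):
    """Minimax concave penalty, evaluated elementwise."""
    t = np.abs(np.asarray(t, dtype=float))
    quad = lam * t - t**2 / (2 * gamma)   # inner region |t| <= gamma*lam
    flat = 0.5 * gamma * lam**2           # penalty is constant beyond gamma*lam
    return np.where(t <= gamma * lam, quad, flat)

def mcp_concave_part(t, lam=1.0, gamma=3.0):
    """Convex function h such that  mcp(t) = lam*|t| - h(t)."""
    t = np.abs(np.asarray(t, dtype=float))
    inner = t**2 / (2 * gamma)            # quadratic near the origin
    outer = lam * t - 0.5 * gamma * lam**2  # affine in |t| past the knot
    return np.where(t <= gamma * lam, inner, outer)

# The DC decomposition holds identically in t:
t = np.linspace(-5, 5, 101)
assert np.allclose(mcp(t), 1.0 * np.abs(t) - mcp_concave_part(t))
```

A SCAD penalty admits an analogous split with a piecewise-quadratic convex part; in either case the first (convex, L1) piece is what makes each sub-problem amenable to a solver such as PDASC.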
Pages: 871-898
Page count: 28