HIGH-DIMENSIONAL LINEAR REGRESSION WITH HARD THRESHOLDING REGULARIZATION: THEORY AND ALGORITHM

Cited by: 0
Authors
Kang, Lican [1, 2]
Lai, Yanming [1 ]
Liu, Yanyan [1 ]
Luo, Yuan [1 ]
Zhang, Jing [3 ]
Affiliations
[1] Wuhan Univ, Sch Math & Stat, Wuhan 430072, Hubei, Peoples R China
[2] Duke NUS Med Sch, Ctr Quantitat Med, Singapore 169857, Singapore
[3] Zhongnan Univ Econ & Law, Sch Math & Stat, Wuhan 430073, Hubei, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
  Generalized Newton method; hard thresholding regularization; high-dimensional; linear regression model; primal dual active set algorithm; NONCONVEX PENALIZED REGRESSION; ACTIVE SET ALGORITHM; VARIABLE SELECTION; ELASTIC-NET; SPARSITY; LASSO;
DOI
10.3934/jimo.2022034
Chinese Library Classification (CLC) number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Variable selection and parameter estimation are fundamental problems in high-dimensional data analysis. In this paper, we employ the hard thresholding regularization method [1] to address them in the setting of the high-dimensional sparse linear regression model. Theoretically, we establish a sharp non-asymptotic estimation error bound for the global solution and further show that the support of the global solution coincides with the target support with high probability. Motivated by the KKT condition, we propose a primal dual active set (PDAS) algorithm to solve the minimization problem, and show that it is essentially a generalized Newton method, which guarantees fast convergence when a good initial value is provided. Furthermore, we propose a sequential version of the PDAS algorithm (SPDAS) with a warm-start strategy that chooses the initial value adaptively. The most significant advantage of the proposed procedure is its computational speed. Extensive numerical studies demonstrate that the proposed method performs well in variable selection and estimation accuracy, and that it compares favorably with existing methods in computational speed. As an illustration, we apply the proposed method to a breast cancer gene expression data set.
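The abstract names the PDAS and SPDAS algorithms but this record does not reproduce their details, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the objective (1/2n)||y - Xβ||² + λ||β||₀ with columns of X scaled so that ||x_j||² = n, in which case the KKT condition reads β = H(β + d) with dual variable d = Xᵀ(y - Xβ)/n and hard threshold √(2λ). The function names `pdas` and `spdas_path`, the λ grid, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def pdas(X, y, lam, beta0=None, max_iter=50):
    """Primal dual active set sketch for the assumed l0-penalized least squares.

    Assumes each column of X satisfies ||x_j||^2 = n (an assumption of this sketch).
    """
    n, p = X.shape
    beta = np.zeros(p) if beta0 is None else np.asarray(beta0, dtype=float).copy()
    d = X.T @ (y - X @ beta) / n          # dual variable at the starting point
    thresh = np.sqrt(2.0 * lam)           # hard-threshold level under the assumed scaling
    prev_active = None
    active = np.zeros(p, dtype=bool)
    for _ in range(max_iter):
        # Active set read off from the KKT condition beta = H(beta + d).
        active = np.abs(beta + d) > thresh
        if prev_active is not None and np.array_equal(active, prev_active):
            break                          # active set is stable: stop
        # Primal update: least squares on the active set, zero elsewhere.
        beta = np.zeros(p)
        if active.any():
            beta[active] = np.linalg.lstsq(X[:, active], y, rcond=None)[0]
        # Dual update: correlation with the residual, zero on the active set.
        d = X.T @ (y - X @ beta) / n
        d[active] = 0.0
        prev_active = active
    return beta, active


def spdas_path(X, y, lams):
    """Sequential variant: run pdas over decreasing lam, warm-starting each run."""
    beta = np.zeros(X.shape[1])
    path = []
    for lam in sorted(lams, reverse=True):
        beta, active = pdas(X, y, lam, beta0=beta)
        path.append((lam, beta.copy(), active.copy()))
    return path


# Usage on simulated data (hypothetical sizes and signal strength).
rng = np.random.default_rng(0)
n, p = 1000, 50
X = rng.standard_normal((n, p))
X = X / np.linalg.norm(X, axis=0) * np.sqrt(n)   # enforce ||x_j||^2 = n
beta_true = np.zeros(p)
beta_true[:3] = 3.0
y = X @ beta_true + 0.1 * rng.standard_normal(n)

beta_hat, active_hat = pdas(X, y, lam=0.5)
path = spdas_path(X, y, lams=[2.0, 1.0, 0.5])
```

Each iteration solves a least squares problem only on the current active set, which is why such a scheme can be cheap when the solution is sparse; the warm start in `spdas_path` simply reuses the previous solution as the initial value for the next, smaller λ.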
Pages: 2104-2122
Page count: 19
Related papers
50 in total
  • [41] THE TAP FREE ENERGY FOR HIGH-DIMENSIONAL LINEAR REGRESSION
    Qiu, Jiaze
    Sen, Subhabrata
    ANNALS OF APPLIED PROBABILITY, 2023, 33 (04): : 2643 - 2680
  • [42] High-dimensional analysis of variance in multivariate linear regression
    Lou, Zhipeng
    Zhang, Xianyang
    Wu, Wei Biao
    BIOMETRIKA, 2023, 110 (03) : 777 - 797
  • [43] Leveraging independence in high-dimensional mixed linear regression
    Wang, Ning
    Deng, Kai
    Mai, Qing
    Zhang, Xin
    BIOMETRICS, 2024, 80 (03)
  • [44] Empirical likelihood for high-dimensional linear regression models
    Guo, Hong
    Zou, Changliang
    Wang, Zhaojun
    Chen, Bin
    METRIKA, 2014, 77 (07) : 921 - 945
  • [46] Path Thresholding: Asymptotically Tuning-Free High-Dimensional Sparse Regression
    Vats, Divyanshu
    Baraniuk, Richard G.
    ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 33, 2014, 33 : 948 - 957
  • [47] SPReM: Sparse Projection Regression Model For High-Dimensional Linear Regression
    Sun, Qiang
    Zhu, Hongtu
    Liu, Yufeng
    Ibrahim, Joseph G.
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2015, 110 (509) : 289 - 302
  • [48] Tests for high-dimensional partially linear regression models
    Hongwei Shi
    Weichao Yang
    Bowen Sun
    Xu Guo
    Statistical Papers, 2025, 66 (3)
  • [49] A stepwise regression algorithm for high-dimensional variable selection
    Hwang, Jing-Shiang
    Hu, Tsuey-Hwa
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2015, 85 (09) : 1793 - 1806
  • [50] Hard Thresholding Regularised Logistic Regression: Theory and Algorithms
    Kang, Lican
    Liu, Yanyan
    Luo, Yuan
    Zhu, Chang
    EAST ASIAN JOURNAL ON APPLIED MATHEMATICS, 2022, 12 (01) : 35 - 52