EnMSP: an elastic-net multi-step screening procedure for high-dimensional regression

Cited by: 0
Authors
Xue, Yushan [1 ]
Ren, Jie [2 ]
Yang, Bin [3 ]
Affiliations
[1] Cent Univ Finance & Econ, Sch Stat & Math, Beijing, Peoples R China
[2] HollySys Grp Co Ltd, Beijing, Peoples R China
[3] Res Ctr Int Inspection & Quarantine Stand & Tech R, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
High-dimensional data; Correlated effects; Elastic-net; Iterative algorithm; EnMSP; NONCONCAVE PENALIZED LIKELIHOOD; VARIABLE SELECTION; LASSO; OPTIMALITY;
DOI
10.1007/s11222-024-10394-9
Chinese Library Classification
TP301 [Theory and methods];
Discipline classification code
081202;
Abstract
To improve the estimation efficiency of high-dimensional regression problems, penalized regularization is routinely used. However, accurately estimating the model remains challenging, particularly in the presence of correlated effects, wherein irrelevant covariates exhibit strong correlation with relevant ones. This situation, referred to as correlated data, poses additional complexities for model estimation. In this paper, we propose the elastic-net multi-step screening procedure (EnMSP), an iterative algorithm designed to recover sparse linear models in the context of correlated data. EnMSP uses a small repeated penalty strategy to identify truly relevant covariates in a few iterations. Specifically, in each iteration, EnMSP enhances the adaptive lasso method by adding a weighted $l_2$ penalty, which improves the selection of relevant covariates. The method is shown to select the true model and achieve the $l_2$-norm error bound under certain conditions. The effectiveness of EnMSP is demonstrated through numerical comparisons and applications in financial data.
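The abstract gives only the outline of the iteration, so the following is a rough sketch of one way an adaptive lasso step with an added weighted $l_2$ penalty could be repeated a few times. It is not the authors' algorithm: the inverse-magnitude weight update, the reuse of the same weights for both penalties, the tuning values, and the names `weighted_enet` and `multi_step_screen` are all assumptions made for illustration. The toy data contain a decoy covariate (column 2) that is strongly correlated with a relevant one (column 0).

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso-type coordinate updates."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_enet(X, y, w, lam1, lam2, n_sweeps=200):
    """Coordinate descent for the (assumed) objective
    (1/2n)||y - Xb||^2 + lam1 * sum_j w[j]*|b_j| + (lam2/2) * sum_j w[j]*b_j^2."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, kept up to date
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # Inner product of column j with the partial residual
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam1 * w[j]) / (col_sq[j] + lam2 * w[j])
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def multi_step_screen(X, y, lam1=0.1, lam2=0.1, n_steps=3, eps=1e-3):
    """Repeat the weighted fit, re-deriving weights from the previous estimate.
    The update w_j = 1 / (|b_j| + eps) is an assumed adaptive-lasso-style rule."""
    p = len(X[0])
    w = [1.0] * p  # first pass is a plain elastic net
    beta = [0.0] * p
    for _ in range(n_steps):
        beta = weighted_enet(X, y, w, lam1, lam2)
        w = [1.0 / (abs(b) + eps) for b in beta]
    return beta


# Toy data: columns 0 and 1 are relevant; column 2 is an irrelevant decoy
# correlated with column 0; columns 3 to 5 are independent noise covariates.
random.seed(0)
n, p = 100, 6
X = []
for _ in range(n):
    row = [random.gauss(0.0, 1.0) for _ in range(p)]
    row[2] = 0.8 * row[0] + 0.6 * random.gauss(0.0, 1.0)
    X.append(row)
y = [2.0 * r[0] - 1.5 * r[1] + 0.3 * random.gauss(0.0, 1.0) for r in X]

beta = multi_step_screen(X, y)
support = [j for j, b in enumerate(beta) if b != 0.0]
print("estimated coefficients:", [round(b, 3) for b in beta])
print("selected covariates:", support)
```

In this sketch the first pass may give the decoy a small nonzero coefficient; the reweighting then penalizes it heavily on the next pass, which is the intuition behind repeating a small penalty rather than applying one large one.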
Pages: 16
Related papers
50 in total
  • [41] Integrative rank-based regression for multi-source high-dimensional data with multi-type responses
    Xu, Fuzhi
    Ma, Shuangge
    Zhang, Qingzhao
    JOURNAL OF APPLIED STATISTICS, 2025,
  • [42] A procedure of linear discrimination analysis with detected sparsity structure for high-dimensional multi-class classification
    Luo, Shan
    Chen, Zehua
    JOURNAL OF MULTIVARIATE ANALYSIS, 2020, 179
  • [43] A robust elastic net via bootstrap method under sampling uncertainty for significance analysis of high-dimensional design problems
    Kim, Hansu
    Lee, Tae Hee
    KNOWLEDGE-BASED SYSTEMS, 2021, 225
  • [44] Variable selection for sparse high-dimensional nonlinear regression models by combining nonnegative garrote and sure independence screening
    Wu, Shuang
    Xue, Hongqi
    Wu, Yichao
    Wu, Hulin
    STATISTICA SINICA, 2014, 24 (03) : 1365 - 1387
  • [45] Multi-kernel Gaussian process latent variable regression model for high-dimensional sequential data modeling
    Zhu, Ziqi
    Zhang, Jiayuan
    Zou, Jixin
    Deng, Chunhua
    NEUROCOMPUTING, 2019, 348 : 3 - 15
  • [46] A Fast Multi-Locus Ridge Regression Algorithm for High-Dimensional Genome-Wide Association Studies
    Zhang, Jin
    Chen, Min
    Wen, Yangjun
    Zhang, Yin
    Lu, Yunan
    Wang, Shengmeng
    Chen, Juncong
    FRONTIERS IN GENETICS, 2021, 12
  • [47] Adaptive Elastic Net Based on Modified PSO for Variable Selection in Cox Model With High-Dimensional Data: A Comprehensive Simulation Study
    Sancar, Nuriye
    Onakpojeruo, Efe Precious
    Inan, Deniz
    Ozsahin, Dilber Uzun
    IEEE ACCESS, 2023, 11 : 127302 - 127316
  • [48] Rates of convergence of the adaptive elastic net and the post-selection procedure in ultra-high dimensional sparse models
    Yang, Yuehan
    Yang, Hu
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2021, 50 (01) : 73 - 94
  • [49] MOKBL+MOMs: An interpretable multi-objective evolutionary fuzzy system for learning high-dimensional regression data
    Aghaeipoor, Fatemeh
    Javidi, Mohammad Masoud
    INFORMATION SCIENCES, 2019, 496 : 1 - 24
  • [50] Comparison of immediate germline sequencing and multi-step screening for Lynch syndrome detection in high-risk endometrial and colorectal cancer patients
    Chao, An-Shine
    Chao, Angel
    Lai, Chyong-Huey
    Lin, Chiao-Yun
    Yang, Lan-Yan
    Chang, Shih-Cheng
    Wu, Ren-Chin
    JOURNAL OF GYNECOLOGIC ONCOLOGY, 2024, 35 (01)