Elastic net-based high dimensional data selection for regression

Cited by: 10
Authors
Chamlal, Hasna [1]
Benzmane, Asmaa [1]
Ouaderhman, Tayeb [1]
Affiliations
[1] Hassan II Univ, Fac Sci Ain Chock, Dept Math & Informat, Fundamental & Appl Math Lab, Casablanca, Morocco
Keywords
Feature screening; Regression; Rank correlation; High-dimensional data; Elastic net; VIEW; REGULARIZATION; ALGORITHM
DOI
10.1016/j.eswa.2023.122958
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
High-dimensional feature selection is of particular interest to researchers. In some domains, such as microarray data, it is quite common for a group of highly correlated explanatory variables to be of equal importance for inclusion in the predictive model. This paper proposes a new hybrid feature selection approach, K-EN, that integrates feature screening based on Kendall's tau with Elastic Net regularized regression. Because K-EN embeds the Elastic Net, it inherits the grouping effect, which automatically includes all the highly correlated variables in a group. The K-EN approach offers insightful solutions to high-dimensional regression problems and improves Elastic Net performance, since the screening phase is preceded by a step that further reduces the number of explanatory variables by removing those that disagree with the target based on Kendall's tau. The use of Kendall's tau further enhances Elastic Net performance, as it is robust to heavy-tailed distributions, non-parametric models, outliers, and non-normal data. K-EN is therefore a time-saving approach. The proposed algorithm is evaluated on four simulation scenarios and four publicly available datasets (riboflavin, eyedata, Longley, and Boston Housing), on which it achieves Mean Squared Errors (MSE) of 0.2528, 0.0098, 0.1007, and 0.4121, respectively. K-EN's MSEs are the best among those achieved by the state-of-the-art approaches reviewed in the paper. In addition, K-EN selects up to 100% of the relevant features when run on simulated data.
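The two-stage idea described in the abstract (rank-correlation screening, then an Elastic Net fit on the surviving variables) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state the exact screening rule, so the sketch assumes a simple one: keep the `keep` features with the largest absolute Kendall's tau. The function names, the tau-a variant (no tie correction), the penalty settings `lam` and `alpha`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    dx = np.sign(x[:, None] - x[None, :])
    dy = np.sign(y[:, None] - y[None, :])
    # Every unordered pair appears twice in the n x n product, hence n*(n-1).
    return (dx * dy).sum() / (n * (n - 1))

def screen_by_tau(X, y, keep):
    """Assumed rule: indices of the `keep` features with the largest |tau|."""
    taus = np.array([kendall_tau(X[:, j], y) for j in range(X.shape[1])])
    return np.sort(np.argsort(-np.abs(taus))[:keep])

def soft_threshold(z, t):
    return np.sign(z) * max(abs(z) - t, 0.0)

def elastic_net(X, y, lam=0.1, alpha=0.5, n_iter=200):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + lam * (alpha*||b||_1 + (1 - alpha)/2 * ||b||_2^2).
    Expects centred y and standardised X (no intercept is fitted)."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.copy()
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of feature j with the residual that excludes feature j.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            resid -= X[:, j] * (new - beta[j])
            beta[j] = new
    return beta

def k_en_sketch(X, y, keep, lam=0.1, alpha=0.5):
    """Screen by |tau|, then fit an Elastic Net on the kept columns."""
    idx = screen_by_tau(X, y, keep)
    Xs = X[:, idx]
    Xs = (Xs - Xs.mean(axis=0)) / Xs.std(axis=0)  # assumes no constant columns
    beta = elastic_net(Xs, y - y.mean(), lam, alpha)
    return idx[beta != 0], beta

# Synthetic example: 200 features, 100 samples, 3 relevant features,
# two of which (columns 0 and 1) are almost perfectly correlated.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 200))
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=100)
y = 2 * X[:, 0] + 2 * X[:, 1] + 3 * X[:, 2] + 0.5 * rng.normal(size=100)

selected, beta = k_en_sketch(X, y, keep=20)
print("selected feature indices:", selected)
```

In this toy setting the correlated pair (columns 0 and 1) is retained together, which is the grouping behaviour the abstract attributes to the Elastic Net. Screening first also means the coordinate descent runs on 20 columns rather than 200, which is the kind of time saving the abstract refers to.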
Pages: 14