Robust adaptive LASSO in high-dimensional logistic regression

Authors
Basu, Ayanendranath [1]
Ghosh, Abhik [1]
Jaenada, Maria [2]
Pardo, Leandro [2]
Affiliations
[1] Indian Stat Inst, Interdisciplinary Stat Res Unit, 203 BT Rd, Kolkata 700108, India
[2] Univ Complutense Madrid, Stat & OR, Plaza Ciencias 3, Madrid 28040, Spain
Keywords
Density power divergence; High-dimensional data; Logistic regression; Oracle properties; Variable selection; Gene selection; Sparse regression; Classification; Cancer; Microarrays; Likelihood; Algorithm; Models
DOI
10.1007/s10260-024-00760-2
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Penalized logistic regression is extremely useful for binary classification with a large number of covariates (exceeding the sample size) and has several real-life applications, including genomic disease classification. However, existing methods based on the likelihood loss function are sensitive to data contamination and other noise; hence, robust methods are needed for stable and more accurate inference. In this paper, we propose a family of robust estimators for sparse logistic models using the popular density power divergence based loss function and general adaptively weighted LASSO penalties. We study the local robustness of the proposed estimators through their influence function and also derive their oracle properties and asymptotic distribution. With extensive empirical illustrations, we demonstrate the significantly improved performance of the proposed estimators over existing ones, with particular gains in robustness. Finally, our proposal is applied to four real datasets for cancer classification, yielding robust and accurate models that simultaneously perform gene selection and patient classification.
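As a rough illustration of the kind of objective the abstract describes, the sketch below combines the standard density power divergence (DPD) loss for a Bernoulli model with a weighted L1 penalty. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the tuning parameter `alpha`, the penalty level `lam`, and the choice to penalize every coefficient are all hypothetical, and the record does not give the paper's exact notation or weighting scheme.

```python
import numpy as np

def dpd_logistic_loss(beta, X, y, alpha):
    """Average DPD loss for a Bernoulli (logistic) model; assumes alpha > 0."""
    eta = X @ beta
    pi = 1.0 / (1.0 + np.exp(-eta))  # modelled success probabilities
    # Sum over both outcomes {0, 1} of the model probability to the power 1 + alpha
    integral_term = pi ** (1 + alpha) + (1 - pi) ** (1 + alpha)
    # Model probability of the observed label, raised to the power alpha
    observed_term = y * pi ** alpha + (1 - y) * (1 - pi) ** alpha
    return np.mean(integral_term - (1 + 1 / alpha) * observed_term)

def adaptive_lasso_penalty(beta, weights, lam):
    """Weighted L1 penalty; the weights are assumed to be supplied by the user."""
    return lam * np.sum(weights * np.abs(beta))

def penalized_objective(beta, X, y, alpha, weights, lam):
    """Hypothetical penalized objective: DPD loss plus adaptively weighted LASSO."""
    return dpd_logistic_loss(beta, X, y, alpha) + adaptive_lasso_penalty(beta, weights, lam)
```

In this sketch a larger `alpha` gives less weight to observations that the model considers unlikely, which is the usual source of robustness in DPD-based losses. In adaptive LASSO the weights are commonly taken from an initial estimate, but how they are built here is left open.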
Pages: 33