A MODEL OF DOUBLE DESCENT FOR HIGH-DIMENSIONAL LOGISTIC REGRESSION

Cited by: 0
Authors
Deng, Zeyu [1 ]
Kammoun, Abla [2 ]
Thrampoulidis, Christos [1 ]
Affiliations
[1] Univ Calif Santa Barbara, Santa Barbara, CA 93106 USA
[2] King Abdullah Univ Sci & Technol, Thuwal, Saudi Arabia
Keywords
Generalization error; Binary classification; Overparameterization; Max-margin; Asymptotics
DOI
10.1109/icassp40776.2020.9053524
Chinese Library Classification
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
We consider a model for logistic regression in which only a subset of features of size p is used to train a linear classifier over n training samples. The classifier is obtained by running gradient descent (GD) on the logistic loss. For this model, we investigate the dependence of the classification error on the overparameterization ratio kappa = p/n. First, building on known deterministic results on the convergence properties of GD, we uncover a phase-transition phenomenon for the case of Gaussian features: the classification error of GD is the same as that of the maximum-likelihood (ML) solution when kappa < kappa*, and the same as that of the max-margin (SVM) solution when kappa > kappa*. Next, using the convex Gaussian min-max theorem (CGMT), we sharply characterize the performance of both the ML and SVM solutions. Combining these results, we obtain curves that explicitly characterize the test error of GD for varying values of kappa. The numerical results validate the theoretical predictions and unveil "double-descent" phenomena that complement similar recent observations in linear regression settings.
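The setup in the abstract lends itself to a small simulation. The sketch below (not the authors' code) draws Gaussian features, generates labels from a logistic model over all d_total features, trains by GD on the logistic loss using only the first p = kappa * n features, and reports test error as kappa varies; all parameter values (n, d_total, signal strength, step size, iteration count) are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    # Numerically stable logistic function; GD on separable data drives the
    # margins (and hence the exponents) to large values.
    out = np.empty_like(z, dtype=float)
    pos = z >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

def gd_test_error(kappa, n=200, d_total=800, steps=3000, lr=0.2, n_test=20000):
    p = max(1, int(round(kappa * n)))
    # Ground-truth direction with a fixed signal strength (an assumption).
    beta = rng.standard_normal(d_total)
    beta *= 2.0 / np.linalg.norm(beta)
    # Training data: Gaussian features, labels in {-1, +1} from a logistic model.
    X = rng.standard_normal((n, d_total))
    y = np.where(rng.random(n) < sigmoid(X @ beta), 1.0, -1.0)
    Xp = X[:, :p]                 # only p features are visible to the learner
    w = np.zeros(p)
    for _ in range(steps):
        # Gradient of the empirical logistic loss
        # (1/n) * sum_i log(1 + exp(-y_i <x_i, w>)).
        grad = -(Xp * (y * sigmoid(-y * (Xp @ w)))[:, None]).mean(axis=0)
        w -= lr * grad
    # Classification error on fresh samples from the same distribution.
    Xt = rng.standard_normal((n_test, d_total))
    yt = np.where(rng.random(n_test) < sigmoid(Xt @ beta), 1.0, -1.0)
    return np.mean(np.sign(Xt[:, :p] @ w) != yt)

for kappa in (0.1, 0.25, 0.5, 1.0, 2.0, 4.0):
    errs = [gd_test_error(kappa) for _ in range(5)]
    print(f"kappa = p/n = {kappa:4.2f}: test error ~ {np.mean(errs):.3f}")

Plotting the averaged error against kappa on a finer grid is what would reveal the double-descent shape; the coarse grid above is only meant to show the mechanics of the experiment.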
Pages: 4267-4271
Page count: 5