A MODEL OF DOUBLE DESCENT FOR HIGH-DIMENSIONAL LOGISTIC REGRESSION

Cited by: 0
Authors
Deng, Zeyu [1 ]
Kammoun, Abla [2 ]
Thrampoulidis, Christos [1 ]
Affiliations
[1] Univ Calif Santa Barbara, Santa Barbara, CA 93106 USA
[2] King Abdullah Univ Sci & Technol, Thuwal, Saudi Arabia
Keywords
Generalization error; Binary classification; Overparameterization; Max-margin; Asymptotics
DOI
10.1109/icassp40776.2020.9053524
CLC number
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
We consider a model for logistic regression in which only a subset of features of size p is used to train a linear classifier over n training samples. The classifier is obtained by running gradient descent (GD) on the logistic loss. For this model, we investigate the dependence of the classification error on the overparameterization ratio κ = p/n. First, building on known deterministic results on the convergence properties of GD, we uncover a phase-transition phenomenon for the case of Gaussian features: the classification error of GD matches that of the maximum-likelihood (ML) solution when κ < κ*, and that of the max-margin (SVM) solution when κ > κ*. Next, using the convex Gaussian min-max theorem (CGMT), we sharply characterize the performance of both the ML and SVM solutions. Combining these results, we obtain curves that explicitly characterize the test error of GD for varying values of κ. The numerical results validate the theoretical predictions and unveil "double-descent" phenomena that complement similar recent observations in linear regression settings.
Pages: 4267 - 4271
Number of pages: 5
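The setup described in the abstract lends itself to a small Monte Carlo illustration. The sketch below is not code from the paper: it trains a linear classifier by plain gradient descent on the unregularized logistic loss, using only the first p of d Gaussian features, and reports the test error for several values of the overparameterization ratio kappa = p/n. The sample sizes, signal strength, step size, and iteration count are illustrative assumptions, not values taken from the paper.

# Minimal Monte Carlo sketch (illustrative only, not the paper's code):
# train a linear classifier by gradient descent on the logistic loss,
# using only the first p of d Gaussian features, and report test error
# for several values of the overparameterization ratio kappa = p / n.
import numpy as np
from scipy.special import expit  # numerically stable logistic sigmoid

rng = np.random.default_rng(0)
n, d = 200, 400                      # training samples, total feature dimension (assumed values)
beta = rng.standard_normal(d)
beta /= np.linalg.norm(beta)         # ground-truth direction, unit norm

def sample(m):
    # Gaussian features; labels drawn from a logistic model with an assumed signal strength of 3
    X = rng.standard_normal((m, d))
    y = np.where(rng.random(m) < expit(3.0 * X @ beta), 1.0, -1.0)
    return X, y

def gd_logistic(Xp, y, n_steps=2000, lr=0.5):
    # Plain gradient descent on the (unregularized) average logistic loss
    w = np.zeros(Xp.shape[1])
    for _ in range(n_steps):
        margins = y * (Xp @ w)
        grad = -(Xp.T @ (y * expit(-margins))) / len(y)
        w -= lr * grad
    return w

X_tr, y_tr = sample(n)
X_te, y_te = sample(20000)

for kappa in (0.1, 0.5, 1.0, 1.5, 2.0):  # overparameterization ratio p / n
    p = int(kappa * n)
    w = gd_logistic(X_tr[:, :p], y_tr)   # classifier sees only the first p features
    err = np.mean(np.sign(X_te[:, :p] @ w) != y_te)
    print(f"kappa = {kappa:.1f} (p = {p:3d})  test error ~ {err:.3f}")

Sweeping kappa over a finer grid and averaging over repetitions would trace out a test-error curve around the interpolation threshold, which is the kind of double-descent behavior the paper characterizes analytically via the CGMT.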