A MODEL OF DOUBLE DESCENT FOR HIGH-DIMENSIONAL LOGISTIC REGRESSION

Cited by: 0
Authors
Deng, Zeyu [1 ]
Kammoun, Abla [2 ]
Thrampoulidis, Christos [1 ]
Affiliations
[1] Univ Calif Santa Barbara, Santa Barbara, CA 93106 USA
[2] King Abdullah Univ Sci & Technol, Thuwal, Saudi Arabia
Keywords
Generalization error; Binary classification; Overparameterization; Max-margin; Asymptotics
DOI
10.1109/icassp40776.2020.9053524
CLC number
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
We consider a model for logistic regression in which only a subset of features of size p is used to train a linear classifier over n training samples. The classifier is obtained by running gradient descent (GD) on the logistic loss. For this model, we investigate the dependence of the classification error on the overparameterization ratio κ = p/n. First, building on known deterministic results on the convergence properties of GD, we uncover a phase-transition phenomenon for the case of Gaussian features: the classification error of GD matches that of the maximum-likelihood (ML) solution when κ < κ*, and that of the max-margin (SVM) solution when κ > κ*. Next, using the convex Gaussian min-max theorem (CGMT), we sharply characterize the performance of both the ML and SVM solutions. Combining these results, we obtain curves that explicitly characterize the test error of GD for varying values of κ. The numerical results validate the theoretical predictions and unveil "double-descent" phenomena that complement similar recent observations in linear regression settings.
Pages: 4267 - 4271
Number of pages: 5
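The setup described in the abstract lends itself to a small Monte Carlo illustration. The sketch below is not code from the paper: it trains a linear classifier by plain gradient descent on the unregularized logistic loss, using only the first p of d Gaussian features, and reports the test error for several values of the overparameterization ratio kappa = p/n. The sample sizes, signal strength, step size, and iteration count are illustrative assumptions, not values taken from the paper.

# Minimal Monte Carlo sketch (illustrative only, not the paper's code):
# train a linear classifier by gradient descent on the logistic loss,
# using only the first p of d Gaussian features, and report test error
# for several values of the overparameterization ratio kappa = p / n.
import numpy as np
from scipy.special import expit  # numerically stable logistic sigmoid

rng = np.random.default_rng(0)
n, d = 200, 400                      # training samples, total feature dimension (assumed values)
beta = rng.standard_normal(d)
beta /= np.linalg.norm(beta)         # ground-truth direction, unit norm

def sample(m):
    # Gaussian features; labels drawn from a logistic model with an assumed signal strength of 3
    X = rng.standard_normal((m, d))
    y = np.where(rng.random(m) < expit(3.0 * X @ beta), 1.0, -1.0)
    return X, y

def gd_logistic(Xp, y, n_steps=2000, lr=0.5):
    # Plain gradient descent on the (unregularized) average logistic loss
    w = np.zeros(Xp.shape[1])
    for _ in range(n_steps):
        margins = y * (Xp @ w)
        grad = -(Xp.T @ (y * expit(-margins))) / len(y)
        w -= lr * grad
    return w

X_tr, y_tr = sample(n)
X_te, y_te = sample(20000)

for kappa in (0.1, 0.5, 1.0, 1.5, 2.0):  # overparameterization ratio p / n
    p = int(kappa * n)
    w = gd_logistic(X_tr[:, :p], y_tr)   # classifier sees only the first p features
    err = np.mean(np.sign(X_te[:, :p] @ w) != y_te)
    print(f"kappa = {kappa:.1f} (p = {p:3d})  test error ~ {err:.3f}")

Sweeping kappa over a finer grid and averaging over repetitions would trace out a test-error curve around the interpolation threshold, which is the kind of double-descent behavior the paper characterizes analytically via the CGMT.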