Initialization-Dependent Sample Complexity of Linear Predictors and Neural Networks

Cited: 0
Authors: Magen, Roey [1]; Shamir, Ohad [1]
Affiliations: [1] Weizmann Inst Sci, Rehovot, Israel
Funding: European Research Council
DOI: not available
CLC classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
We provide several new results on the sample complexity of vector-valued linear predictors (parameterized by a matrix), and more generally of neural networks. Focusing on size-independent bounds, where only the Frobenius-norm distance of the parameters from some fixed reference matrix W0 is controlled, we show that the sample complexity behavior can be surprisingly different from what one might expect based on the well-studied setting of scalar-valued linear predictors. This also leads to new sample complexity bounds for feed-forward neural networks, tackling some open questions in the literature, and establishing a new convex linear prediction problem that is provably learnable without uniform convergence.
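The predictor class described in the abstract can be sketched concretely. The following is an illustrative NumPy example, not code from the paper: it models vector-valued linear predictors x -> Wx where only the Frobenius distance of W from a fixed reference matrix W0 (e.g. the initialization) is constrained. All names (W0, B, d, k) are hypothetical choices for illustration.

```python
import numpy as np

# Illustrative sketch (assumption, not from the paper): the hypothesis
# class is {x -> W x : ||W - W0||_F <= B}, i.e. matrices within a
# Frobenius-norm ball of radius B around a fixed reference matrix W0.

rng = np.random.default_rng(0)
d, k = 5, 3          # input dimension, output dimension (hypothetical)
B = 1.0              # Frobenius-norm radius around the reference matrix
W0 = rng.standard_normal((k, d))   # fixed reference matrix (e.g. initialization)

def in_class(W, W0=W0, B=B):
    """Check membership in {W : ||W - W0||_F <= B}."""
    return np.linalg.norm(W - W0, ord="fro") <= B

def predict(W, x):
    """Vector-valued linear predictor parameterized by the matrix W."""
    return W @ x

# A perturbation of W0 scaled to lie safely inside the constraint ball.
Delta = rng.standard_normal((k, d))
W = W0 + 0.5 * B * Delta / np.linalg.norm(Delta, ord="fro")
x = rng.standard_normal(d)

print(in_class(W))          # membership holds by construction
print(predict(W, x).shape)  # output is a k-dimensional vector
```

The point of the construction is that the constraint depends only on the distance from W0, not on the dimensions d and k, which is what makes "size-independent" bounds meaningful.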
Pages: 27
Related Papers (50 total; duplicates removed from the entries shown)
  • [1] Schmitt, M. On the Sample Complexity for Nonoverlapping Neural Networks. Machine Learning, 1999, 37(2): 131-141.
  • [2] Shamir, Ohad. The Sample Complexity of Learning Linear Predictors with the Squared Loss. Journal of Machine Learning Research, 2015, 16: 3475-3486.
  • [3] Wei, Colin; Ma, Tengyu. Data-Dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation. Advances in Neural Information Processing Systems 32 (NeurIPS 2019), 2019.
  • [4] Chen, Qipin; Hao, Wenrui; He, Juncai. A Weight Initialization Based on the Linear Product Structure for Neural Networks. Applied Mathematics and Computation, 2022, 415.
  • [5] Golowich, Noah; Rakhlin, Alexander; Shamir, Ohad. Size-Independent Sample Complexity of Neural Networks. Information and Inference: A Journal of the IMA, 2020, 9(2): 473-504.
  • [6] Adam, S. P.; Karras, D. A.; Magoulas, G. D.; Vrahatis, M. N. Solving the Linear Interval Tolerance Problem for Weight Initialization of Neural Networks. Neural Networks, 2014, 54: 17-37.
  • [7] Vardi, Gal; Shamir, Ohad; Srebro, Nathan. The Sample Complexity of One-Hidden-Layer Neural Networks. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022.
  • [8] Kim, L. S. Construction and Initialization of a Hidden Layer of Multilayer Neural Networks Using Linear Programming. Critical Technology: Proceedings of the Third World Congress on Expert Systems, Vols I and II, 1996: 986-992.