Initialization-Dependent Sample Complexity of Linear Predictors and Neural Networks

Cited: 0
Authors: Magen, Roey [1]; Shamir, Ohad [1]
Affiliations: [1] Weizmann Inst Sci, Rehovot, Israel
Funding: European Research Council
DOI: not available
CLC classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
We provide several new results on the sample complexity of vector-valued linear predictors (parameterized by a matrix), and more generally of neural networks. Focusing on size-independent bounds, where only the Frobenius-norm distance of the parameters from some fixed reference matrix W0 is controlled, we show that the sample complexity behavior can be surprisingly different from what one might expect based on the well-studied setting of scalar-valued linear predictors. This also leads to new sample complexity bounds for feed-forward neural networks, tackling some open questions in the literature, and establishing a new convex linear prediction problem that is provably learnable without uniform convergence.
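The predictor class described in the abstract can be sketched concretely. The following is an illustrative NumPy example, not code from the paper: it models vector-valued linear predictors x -> Wx where only the Frobenius distance of W from a fixed reference matrix W0 (e.g. the initialization) is constrained. All names (W0, B, d, k) are hypothetical choices for illustration.

```python
import numpy as np

# Illustrative sketch (assumption, not from the paper): the hypothesis
# class is {x -> W x : ||W - W0||_F <= B}, i.e. matrices within a
# Frobenius-norm ball of radius B around a fixed reference matrix W0.

rng = np.random.default_rng(0)
d, k = 5, 3          # input dimension, output dimension (hypothetical)
B = 1.0              # Frobenius-norm radius around the reference matrix
W0 = rng.standard_normal((k, d))   # fixed reference matrix (e.g. initialization)

def in_class(W, W0=W0, B=B):
    """Check membership in {W : ||W - W0||_F <= B}."""
    return np.linalg.norm(W - W0, ord="fro") <= B

def predict(W, x):
    """Vector-valued linear predictor parameterized by the matrix W."""
    return W @ x

# A perturbation of W0 scaled to lie safely inside the constraint ball.
Delta = rng.standard_normal((k, d))
W = W0 + 0.5 * B * Delta / np.linalg.norm(Delta, ord="fro")
x = rng.standard_normal(d)

print(in_class(W))          # membership holds by construction
print(predict(W, x).shape)  # output is a k-dimensional vector
```

The point of the construction is that the constraint depends only on the distance from W0, not on the dimensions d and k, which is what makes "size-independent" bounds meaningful.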
Pages: 27
Related Papers (50 total; duplicates removed from the entries shown)
  • [1] Schmitt, M. On the Sample Complexity for Nonoverlapping Neural Networks. Machine Learning, 1999, 37(2): 131-141.
  • [2] Shamir, Ohad. The Sample Complexity of Learning Linear Predictors with the Squared Loss. Journal of Machine Learning Research, 2015, 16: 3475-3486.
  • [3] Wei, Colin; Ma, Tengyu. Data-Dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation. Advances in Neural Information Processing Systems 32 (NeurIPS 2019), 2019.
  • [4] Chen, Qipin; Hao, Wenrui; He, Juncai. A Weight Initialization Based on the Linear Product Structure for Neural Networks. Applied Mathematics and Computation, 2022, 415.
  • [5] Golowich, Noah; Rakhlin, Alexander; Shamir, Ohad. Size-Independent Sample Complexity of Neural Networks. Information and Inference: A Journal of the IMA, 2020, 9(2): 473-504.
  • [6] Adam, S. P.; Karras, D. A.; Magoulas, G. D.; Vrahatis, M. N. Solving the Linear Interval Tolerance Problem for Weight Initialization of Neural Networks. Neural Networks, 2014, 54: 17-37.
  • [7] Vardi, Gal; Shamir, Ohad; Srebro, Nathan. The Sample Complexity of One-Hidden-Layer Neural Networks. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022.
  • [8] Kim, L. S. Construction and Initialization of a Hidden Layer of Multilayer Neural Networks Using Linear Programming. Critical Technology: Proceedings of the Third World Congress on Expert Systems, Vols I and II, 1996: 986-992.