Learning Narrow One-Hidden-Layer ReLU Networks

Cited by: 0
Authors
Chen, Sitan [1 ]
Dou, Zehao [2 ]
Goel, Surbhi [3 ]
Klivans, Adam [4 ]
Meka, Raghu [5 ]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Yale Univ, New Haven, CT USA
[3] Univ Penn, Philadelphia, PA 19104 USA
[4] Univ Texas Austin, Austin, TX 78712 USA
[5] Univ Calif Los Angeles, Los Angeles, CA 90024 USA
Keywords: (none listed)
DOI: (not available)
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
We consider the well-studied problem of learning a linear combination of k ReLU activations with respect to a Gaussian distribution on inputs in d dimensions. We give the first polynomial-time algorithm that succeeds whenever k is a constant. All prior polynomial-time learners require additional assumptions on the network, such as positive combining coefficients or the matrix of hidden weight vectors being well-conditioned. Our approach is based on analyzing random contractions of higher-order moment tensors. We use a multi-scale analysis to argue that sufficiently close neurons can be collapsed together, sidestepping the conditioning issues present in prior work. This allows us to design an iterative procedure to discover individual neurons.
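
To make the abstract's core primitive concrete, the NumPy sketch below (illustrative code, not from the paper) shows what a "random contraction of a higher-order moment tensor" looks like for this model. It relies on the standard Hermite identity E[ReLU(<w, x>) He_4(x)] = c4 * w^{tensor 4} for unit-norm w and x ~ N(0, I_d), with c4 = -1/sqrt(2*pi), and then recovers neuron directions by Jennrich-style simultaneous diagonalization of two random contractions. Caveat: that classical recovery step assumes the hidden weight vectors are linearly independent and well-conditioned, which is exactly the assumption the paper's multi-scale collapsing argument removes; all variable and function names here are hypothetical.

import numpy as np

rng = np.random.default_rng(0)
d, k, n = 8, 3, 1_000_000

# Hypothetical planted network y = sum_i a_i * ReLU(<w_i, x>), x ~ N(0, I_d).
W = rng.normal(size=(k, d))
W /= np.linalg.norm(W, axis=1, keepdims=True)   # unit-norm hidden weights
a = rng.choice([-1.0, 1.0], size=k)             # signed coefficients (no positivity assumed)

X = rng.normal(size=(n, d))
y = np.maximum(X @ W.T, 0.0) @ a

def hermite4_contraction(X, y, g, h):
    # Empirical M = E[y * He_4(x)(g, h, ., .)], a d x d matrix.
    # Since E[ReLU(<w, x>) He_4(x)] = c4 * w^{tensor 4} for unit w,
    # M approximates c4 * sum_i a_i <w_i, g> <w_i, h> w_i w_i^T.
    m, dim = X.shape
    xg, xh = X @ g, X @ h
    ExxT = lambda s: (X * s[:, None]).T @ X / m   # E[s(x) * x x^T]
    Ex = lambda s: X.T @ s / m                    # E[s(x) * x]
    return (ExxT(y * xg * xh)
            - (g @ h) * ExxT(y)
            - np.outer(g, Ex(y * xh)) - np.outer(Ex(y * xh), g)
            - np.outer(h, Ex(y * xg)) - np.outer(Ex(y * xg), h)
            - np.mean(y * xg * xh) * np.eye(dim)
            + np.mean(y) * ((g @ h) * np.eye(dim) + np.outer(g, h) + np.outer(h, g)))

# Two independent random contractions, then Jennrich-style simultaneous
# diagonalization; this is the step that uses linear independence of the w_i.
g1, h1, g2, h2 = rng.normal(size=(4, d))
M1 = hermite4_contraction(X, y, g1, h1)
M2 = hermite4_contraction(X, y, g2, h2)

U = np.linalg.svd(np.hstack([M1, M2]))[0][:, :k]   # orthonormal basis for span{w_i}
A, B = U.T @ M1 @ U, U.T @ M2 @ U
_, V = np.linalg.eig(A @ np.linalg.inv(B))          # eigenvectors lift to the w_i
W_hat = (U @ np.real(V)).T
W_hat /= np.linalg.norm(W_hat, axis=1, keepdims=True)

# |cosine| overlaps: close to a signed permutation matrix on this toy instance.
print(np.round(np.abs(W_hat @ W.T), 2))

On a well-conditioned toy instance like this one, the printed overlap matrix should be near a signed permutation. The paper's actual algorithm instead applies contractions iteratively at multiple scales, collapsing nearly parallel neurons into single units before refining them, which is what yields polynomial time for any constant k without conditioning assumptions.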
Pages: 35
Related Papers (50 total; items [31]-[40] shown)
  • [31] Threefold vs. fivefold cross validation in one-hidden-layer and two-hidden-layer predictive neural network modeling of machining surface roughness data
    Feng, Chang-Xue Jack
    Yu, Zhi-Guang
    Kingi, Unnati
    Baig, M. Pervaiz
    JOURNAL OF MANUFACTURING SYSTEMS, 2005, 24 (02) : 93 - 107
  • [32] Minimax Lower Bounds for Transfer Learning with Linear and One-hidden Layer Neural Networks
    Kalan, Seyed Mohammadreza Mousavi
    Fabian, Zalan
    Avestimehr, Salman
    Soltanolkotabi, Mahdi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [33] Learning Distributions Generated by Single-Layer ReLU Networks in the Presence of Arbitrary Outliers
    Bulusu, Saikiran
    Joseph, Geethu
    Gursoy, M. Cenk
    Varshney, Pramod K.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
  • [34] Optimal Approximation of Square Integrable Functions by a Flexible One-Hidden-Layer Neural Network of Excitatory and Inhibitory Neuron Pairs
    Giraud, B
    Liu, LC
    Bernard, C
    Axelrad, H
    NEURAL NETWORKS, 1991, 4 (06) : 803 - 815
  • [35] Learning Two-Layer ReLU Networks Is Nearly as Easy as Learning Linear Classifiers on Separable Data
    Yang, Qiuling
    Sadeghi, Alireza
    Wang, Gang
    Sun, Jian
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2021, 69 : 4416 - 4427
  • [36] Characterization of degree of approximation for neural networks with one hidden layer
    Cao, Fei-Long
    Xu, Zong-Ben
    He, Man-Xi
    PROCEEDINGS OF 2006 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2006: 2944+
  • [37] Approximation by Ridge Functions and Neural Networks with One Hidden Layer
    Chui, CK
    Li, X
    JOURNAL OF APPROXIMATION THEORY, 1992, 70 (02) : 131 - 141
  • [38] Limitations of the approximation capabilities of neural networks with one hidden layer
    Chui, CK
    Li, X
    Mhaskar, HN
    ADVANCES IN COMPUTATIONAL MATHEMATICS, 1996, 5 (2-3) : 233 - 243
  • [39] A ReLU Dense Layer to Improve the Performance of Neural Networks
    Javid, Alireza M.
    Das, Sandipan
    Skoglund, Mikael
    Chatterjee, Saikat
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 2810 - 2814
  • [40] How Does Promoting the Minority Fraction Affect Generalization? A Theoretical Study of One-Hidden-Layer Neural Network on Group Imbalance
    Li, Hongkang
    Zhang, Shuai
    Zhang, Yihua
    Wang, Meng
    Liu, Sijia
    Chen, Pin-Yu
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2024, 18 (02) : 216 - 231