Learning Narrow One-Hidden-Layer ReLU Networks

Cited by: 0
Authors
Chen, Sitan [1]
Dou, Zehao [2]
Goel, Surbhi [3]
Klivans, Adam [4]
Meka, Raghu [5]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Yale, New Haven, CT USA
[3] Univ Penn, Philadelphia, PA 19104 USA
[4] Univ Texas Austin, Austin, TX 78712 USA
[5] Univ Calif Los Angeles, Los Angeles, CA 90024 USA
Keywords: none listed
DOI: not available
CLC classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
We consider the well-studied problem of learning a linear combination of k ReLU activations with respect to a Gaussian distribution on inputs in d dimensions. We give the first polynomial-time algorithm that succeeds whenever k is a constant. All prior polynomial-time learners require additional assumptions on the network, such as positive combining coefficients or the matrix of hidden weight vectors being well-conditioned. Our approach is based on analyzing random contractions of higher-order moment tensors. We use a multi-scale analysis to argue that sufficiently close neurons can be collapsed together, sidestepping the conditioning issues present in prior work. This allows us to design an iterative procedure to discover individual neurons.
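To make the phrase "random contractions of higher-order moment tensors" concrete, below is a minimal NumPy sketch under the abstract's Gaussian-input setting. It is an illustration, not the paper's procedure: the dimensions, sample size, the +/-1 combining coefficients, and the single-contraction spectral read-out at the end are all assumptions made for this toy demo.

```python
# Minimal illustrative sketch (not the paper's algorithm): one random contraction
# of the degree-4 Hermite moment tensor of a one-hidden-layer ReLU network under
# Gaussian inputs.  Dimensions, sample size, +/-1 coefficients, and the spectral
# read-out are assumptions for this demo.
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 10, 3, 1_000_000                       # input dim, #neurons, #samples

# Ground-truth network: y = sum_i c_i * ReLU(<w_i, x>), x ~ N(0, I_d).
W = rng.standard_normal((k, d))
W /= np.linalg.norm(W, axis=1, keepdims=True)    # unit-norm hidden weights
c = rng.choice([-1.0, 1.0], size=k)              # signed combining coefficients

X = rng.standard_normal((n, d))
y = np.maximum(X @ W.T, 0.0) @ c

# Contract the degree-4 Hermite moment tensor T4 = E[y * He4(x)] with a random
# unit vector u in two slots.  By Stein's identity, in expectation
#     M_u = const * sum_i c_i * <w_i, u>^2 * w_i w_i^T,
# so the spectrum of the empirical M_u carries information about the neurons.
u = rng.standard_normal(d)
u /= np.linalg.norm(u)
ux = X @ u                                       # <u, x_j> for every sample

# Per-sample contraction of He4(x) with u (x) u (for unit u):
#   ((u.x)^2 - 1)(x x^T - I) - 2(u.x)(u x^T + x u^T) + 2 u u^T
a = y * (ux ** 2 - 1.0)
v = X.T @ (y * ux) / n
M_u = (X.T @ (a[:, None] * X)) / n \
      - a.mean() * np.eye(d) \
      - 2.0 * (np.outer(u, v) + np.outer(v, u)) \
      + 2.0 * y.mean() * np.outer(u, u)

# Heuristic read-out: top-k directions by |eigenvalue|.  Exact recovery from a
# single contraction is not guaranteed; the paper instead combines many such
# contractions with a multi-scale, iterative neuron-discovery procedure.
vals, vecs = np.linalg.eigh(M_u)
candidates = vecs[:, np.argsort(-np.abs(vals))[:k]]

print("|<w_i, candidate_j>| overlaps (entries near 1 indicate alignment):")
print(np.round(np.abs(W @ candidates), 3))
```

Different random directions u reweight the neurons by <w_i, u>^2, which is, roughly, what lets repeated contractions tease individual neurons apart; per the abstract, the multi-scale analysis additionally collapses neurons that are too close to distinguish, avoiding conditioning assumptions on the weight matrix.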
Pages: 35
Related papers (50 in total; items [41]-[50] shown below)
  • [41] BYY learning, regularized implementation, and model selection on modular networks with one hidden layer of binary units
    Xu, L
    NEUROCOMPUTING, 2003, 51 : 277 - 301
  • [42] A sequential learning approach for single hidden layer neural networks
    Zhang, J
    Morris, AJ
    NEURAL NETWORKS, 1998, 11 (01) : 65 - 80
  • [43] Simultaneous perturbation for single hidden layer networks - cascade learning
    Thangavel, P
    Kathirvalavakumar, T
    NEUROCOMPUTING, 2003, 50 : 193 - 209
  • [44] Approximation Algorithms Using Generalized Translation Networks with One Hidden Layer
    Hahm, Nahmwoo
    Hong, Bum Il
    Yoo, Intae R.
    JOURNAL OF THE KOREAN PHYSICAL SOCIETY, 2011, 58 (02) : 174 - 181
  • [45] Adversarial Examples in Multi-Layer Random ReLU Networks
    Bartlett, Peter L.
    Bubeck, Sebastien
    Cherapanamjeri, Yeshwanth
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [46] Annihilation of Spurious Minima in Two-Layer ReLU Networks
    Arjevani, Yossi
    Field, Michael
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [47] Simultaneous approximations of multivariate functions and their derivatives by neural networks with one hidden layer
    Li, X
    NEUROCOMPUTING, 1996, 12 (04) : 327 - 343
  • [48] Convergence Analysis of Two-layer Neural Networks with ReLU Activation
    Li, Yuanzhi
    Yuan, Yang
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [49] A Global Universality of Two-Layer Neural Networks with ReLU Activations
    Hatano, Naoya
    Ikeda, Masahiro
    Ishikawa, Isao
    Sawano, Yoshihiro
    JOURNAL OF FUNCTION SPACES, 2021, 2021
  • [50] Learning Deep ReLU Networks Is Fixed-Parameter Tractable
    Chen, Sitan
    Klivans, Adam R.
    Meka, Raghu
    2021 IEEE 62ND ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS 2021), 2022: 696 - 707