Learning Narrow One-Hidden-Layer ReLU Networks

Cited by: 0
Authors
Chen, Sitan [1 ]
Dou, Zehao [2 ]
Goel, Surbhi [3 ]
Klivans, Adam [4 ]
Meka, Raghu [5 ]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Yale, New Haven, CT USA
[3] Univ Penn, Philadelphia, PA 19104 USA
[4] Univ Texas Austin, Austin, TX 78712 USA
[5] Univ Calif Los Angeles, Los Angeles, CA 90024 USA
DOI: not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
We consider the well-studied problem of learning a linear combination of k ReLU activations with respect to a Gaussian distribution on inputs in d dimensions. We give the first polynomial-time algorithm that succeeds whenever k is a constant. All prior polynomial-time learners require additional assumptions on the network, such as positive combining coefficients or the matrix of hidden weight vectors being well-conditioned. Our approach is based on analyzing random contractions of higher-order moment tensors. We use a multi-scale analysis to argue that sufficiently close neurons can be collapsed together, sidestepping the conditioning issues present in prior work. This allows us to design an iterative procedure to discover individual neurons.
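The moment-based idea in the abstract can be illustrated with a toy second-order sketch (the paper's actual algorithm uses higher-order moment tensors and random contractions, which this does not reproduce). For unit-norm hidden weights w_i and standard Gaussian inputs x, a Stein-type identity gives E[y (xx^T − I)] = c2 · Σ_i a_i w_i w_i^T with c2 = E[ReLU(g)(g² − 1)] = 1/√(2π), so the top-k eigenspace of an empirical estimate of this matrix approximates span{w_1, …, w_k}. The dimensions, sample count, and orthonormal ground-truth weights below are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, N = 10, 2, 200_000

# Hypothetical ground-truth network: y = sum_i a_i * relu(<w_i, x>).
W = np.zeros((k, d))
W[0, 0] = 1.0
W[1, 1] = 1.0                       # orthonormal hidden weight vectors
a = np.array([1.0, -1.0])           # signed combining coefficients (no positivity assumed)

X = rng.standard_normal((N, d))     # Gaussian inputs
y = np.maximum(X @ W.T, 0.0) @ a    # noiseless network outputs

# Empirical second-order Hermite moment: (1/N) sum_n y_n (x_n x_n^T - I)
# approximates c2 * sum_i a_i w_i w_i^T, where c2 = 1/sqrt(2*pi) ~ 0.3989.
M = (X.T * y) @ X / N - np.mean(y) * np.eye(d)

# Top-k eigenvectors by |eigenvalue| approximately span {w_1, ..., w_k}.
evals, evecs = np.linalg.eigh(M)
top = evecs[:, np.argsort(-np.abs(evals))[:k]]

# Residual of projecting the true weights onto the recovered subspace.
resid = np.linalg.norm(W.T - top @ (top.T @ W.T))
print(f"subspace residual: {resid:.3f}")
```

Note the negative coefficient a_2 = −1: the weight directions are still recoverable here because the eigenvectors are ranked by eigenvalue magnitude, but when distinct neurons are nearly parallel their signed contributions can cancel, which is the conditioning issue the paper's multi-scale collapsing argument is designed to sidestep.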
Pages: 35
Related Papers (showing 10 of 50)
  • [1] Learning One-hidden-layer ReLU Networks via Gradient Descent
    Zhang, Xiao; Yu, Yaodong; Wang, Lingxiao; Gu, Quanquan
    22nd International Conference on Artificial Intelligence and Statistics (AISTATS), Vol. 89, 2019
  • [2] Understanding the loss landscape of one-hidden-layer ReLU networks
    Liu, Bo
    Knowledge-Based Systems, 2021, 220
  • [3] On the landscape of one-hidden-layer sparse networks and beyond
    Lin, Dachao; Sun, Ruoyu; Zhang, Zhihua
    Artificial Intelligence, 2022, 309
  • [4] Large deviations of one-hidden-layer neural networks
    Hirsch, Christian; Willhalm, Daniel
    Stochastics and Dynamics, 2024, 24 (08)
  • [5] Learning One-hidden-layer Neural Networks under General Input Distributions
    Gao, Weihao; Makkuva, Ashok Vardhan; Oh, Sewoong; Viswanath, Pramod
    22nd International Conference on Artificial Intelligence and Statistics (AISTATS), Vol. 89, 2019
  • [6] Tight Sample Complexity of Learning One-hidden-layer Convolutional Neural Networks
    Cao, Yuan; Gu, Quanquan
    Advances in Neural Information Processing Systems 32 (NeurIPS 2019), 2019
  • [7] The Sample Complexity of One-Hidden-Layer Neural Networks
    Vardi, Gal; Shamir, Ohad; Srebro, Nathan
    Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022
  • [8] Recovery Guarantees for One-hidden-layer Neural Networks
    Zhong, Kai; Song, Zhao; Jain, Prateek; Bartlett, Peter L.; Dhillon, Inderjit S.
    International Conference on Machine Learning (ICML), Vol. 70, 2017
  • [9] Local Geometry of Cross Entropy Loss in Learning One-Hidden-Layer Neural Networks
    Fu, Haoyu; Chi, Yuejie; Liang, Yingbin
    2019 IEEE International Symposium on Information Theory (ISIT), 2019, pp. 1972-1976
  • [10] Analysis of one-hidden-layer Neural Networks via the Resolvent Method
    Piccolo, Vanessa; Schroder, Dominik
    Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021