Mechanism for feature learning in neural networks and backpropagation-free machine learning models

Cited by: 12
Authors
Radhakrishnan, Adityanarayanan [1 ,2 ]
Beaglehole, Daniel [3 ]
Pandit, Parthe [4 ,5 ]
Belkin, Mikhail [3 ,5 ]
Affiliations
[1] Harvard Sch Engn & Appl Sci, Cambridge, MA 02138 USA
[2] Broad Inst MIT & Harvard, Cambridge, MA 02142 USA
[3] Univ Calif San Diego, Comp Sci & Engn, La Jolla, CA 92093 USA
[4] Indian Inst Technol, Ctr Machine Intelligence & Data Sci, Mumbai 400076, India
[5] Univ Calif San Diego, Halicioglu Data Sci Inst, La Jolla, CA 92093 USA
Funding
US National Science Foundation
Keywords
REGRESSION;
DOI
10.1126/science.adi5639
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Subject Classification Codes
07; 0710; 09
Abstract
Understanding how neural networks learn features, or relevant patterns in data, for prediction is necessary for their reliable use in technological and scientific applications. In this work, we presented a unifying mathematical mechanism, known as the average gradient outer product (AGOP), that characterized feature learning in neural networks. We provided empirical evidence that AGOP captured features learned by various neural network architectures, including transformer-based language models, convolutional networks, multilayer perceptrons, and recurrent neural networks. Moreover, we demonstrated that AGOP, which is backpropagation-free, enabled feature learning in machine learning models, such as kernel machines, that a priori could not identify task-specific features. Overall, we established a fundamental mechanism that captured feature learning in neural networks and enabled feature learning in general machine learning models.
Pages: 1461-1467 (7 pages)
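
The abstract centers on two ideas: (i) the average gradient outer product of a trained predictor f over data x_1, ..., x_n, namely M = (1/n) * sum_i grad f(x_i) grad f(x_i)^T, recovers the features the model has learned, and (ii) iterating kernel regression with AGOP-reweighted distances yields feature learning without backpropagation (the paper's Recursive Feature Machines). The following is a minimal NumPy sketch of both steps, not the authors' reference implementation; the choice of Laplace kernel, its bandwidth, the ridge parameter, and the use of the raw AGOP as the metric update (the published method may apply a matrix power) are illustrative assumptions.

import numpy as np

def agop(grad_f, X):
    """Average gradient outer product: M = (1/n) sum_i grad_f(x_i) grad_f(x_i)^T."""
    grads = np.stack([grad_f(x) for x in X])               # (n, d)
    return grads.T @ grads / len(X)

def laplace_kernel(X, Z, M, bandwidth=10.0):
    """Laplace kernel under the Mahalanobis metric: exp(-||x - z||_M / L)."""
    XM, ZM = X @ M, Z @ M
    sq = (np.einsum('ij,ij->i', XM, X)[:, None]
          + np.einsum('ij,ij->i', ZM, Z)[None, :]
          - 2.0 * XM @ Z.T)
    return np.exp(-np.sqrt(np.clip(sq, 0.0, None)) / bandwidth)

def rfm(X, y, iters=5, reg=1e-3, bandwidth=10.0):
    """Backpropagation-free feature learning: alternate kernel ridge
    regression with AGOP updates of the metric M."""
    n, d = X.shape
    M = np.eye(d)                                          # start with no feature weighting
    for _ in range(iters):
        K = laplace_kernel(X, X, M, bandwidth)
        alpha = np.linalg.solve(K + reg * np.eye(n), y)    # kernel ridge fit

        def grad_f(x, alpha=alpha, M=M):
            # Gradient of f(x) = sum_i alpha_i exp(-||x - x_i||_M / L) w.r.t. x.
            diffs = x[None, :] - X                         # (n, d)
            Md = diffs @ M
            dist = np.sqrt(np.clip(np.einsum('ij,ij->i', Md, diffs), 1e-12, None))
            k = np.exp(-dist / bandwidth)
            return -((alpha * k / (bandwidth * dist))[:, None] * Md).sum(axis=0)

        M = agop(grad_f, X)                                # feature-learning step
    return M, alpha

The top eigenvectors of the returned M indicate the input directions the fitted model relies on most, which is how this sketch mirrors the feature matrices the paper extracts from trained networks.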