Mechanism for feature learning in neural networks and backpropagation-free machine learning models

Cited by: 12
Authors
Radhakrishnan, Adityanarayanan [1 ,2 ]
Beaglehole, Daniel [3 ]
Pandit, Parthe [4 ,5 ]
Belkin, Mikhail [3 ,5 ]
Affiliations
[1] Harvard Sch Engn & Appl Sci, Cambridge, MA 02138 USA
[2] Broad Inst MIT & Harvard, Cambridge, MA 02142 USA
[3] Univ Calif San Diego, Comp Sci & Engn, La Jolla, CA 92093 USA
[4] Indian Inst Technol, Ctr Machine Intelligence & Data Sci, Mumbai 400076, India
[5] Univ Calif San Diego, Halicioglu Data Sci Inst, La Jolla, CA 92093 USA
Funding
US National Science Foundation
Keywords
REGRESSION;
DOI
10.1126/science.adi5639
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Subject Classification Codes
07; 0710; 09
Abstract
Understanding how neural networks learn features, or relevant patterns in data, for prediction is necessary for their reliable use in technological and scientific applications. In this work, we presented a unifying mathematical mechanism, known as the average gradient outer product (AGOP), that characterized feature learning in neural networks. We provided empirical evidence that AGOP captured features learned by various neural network architectures, including transformer-based language models, convolutional networks, multilayer perceptrons, and recurrent neural networks. Moreover, we demonstrated that AGOP, which is backpropagation-free, enabled feature learning in machine learning models, such as kernel machines, that a priori could not identify task-specific features. Overall, we established a fundamental mechanism that captured feature learning in neural networks and enabled feature learning in general machine learning models.
Pages: 1461-1467 (7 pages)
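
The abstract centers on two ideas: (i) the average gradient outer product of a trained predictor f over data x_1, ..., x_n, namely M = (1/n) * sum_i grad f(x_i) grad f(x_i)^T, recovers the features the model has learned, and (ii) iterating kernel regression with AGOP-reweighted distances yields feature learning without backpropagation (the paper's Recursive Feature Machines). The following is a minimal NumPy sketch of both steps, not the authors' reference implementation; the choice of Laplace kernel, its bandwidth, the ridge parameter, and the use of the raw AGOP as the metric update (the published method may apply a matrix power) are illustrative assumptions.

import numpy as np

def agop(grad_f, X):
    """Average gradient outer product: M = (1/n) sum_i grad_f(x_i) grad_f(x_i)^T."""
    grads = np.stack([grad_f(x) for x in X])               # (n, d)
    return grads.T @ grads / len(X)

def laplace_kernel(X, Z, M, bandwidth=10.0):
    """Laplace kernel under the Mahalanobis metric: exp(-||x - z||_M / L)."""
    XM, ZM = X @ M, Z @ M
    sq = (np.einsum('ij,ij->i', XM, X)[:, None]
          + np.einsum('ij,ij->i', ZM, Z)[None, :]
          - 2.0 * XM @ Z.T)
    return np.exp(-np.sqrt(np.clip(sq, 0.0, None)) / bandwidth)

def rfm(X, y, iters=5, reg=1e-3, bandwidth=10.0):
    """Backpropagation-free feature learning: alternate kernel ridge
    regression with AGOP updates of the metric M."""
    n, d = X.shape
    M = np.eye(d)                                          # start with no feature weighting
    for _ in range(iters):
        K = laplace_kernel(X, X, M, bandwidth)
        alpha = np.linalg.solve(K + reg * np.eye(n), y)    # kernel ridge fit

        def grad_f(x, alpha=alpha, M=M):
            # Gradient of f(x) = sum_i alpha_i exp(-||x - x_i||_M / L) w.r.t. x.
            diffs = x[None, :] - X                         # (n, d)
            Md = diffs @ M
            dist = np.sqrt(np.clip(np.einsum('ij,ij->i', Md, diffs), 1e-12, None))
            k = np.exp(-dist / bandwidth)
            return -((alpha * k / (bandwidth * dist))[:, None] * Md).sum(axis=0)

        M = agop(grad_f, X)                                # feature-learning step
    return M, alpha

The top eigenvectors of the returned M indicate the input directions the fitted model relies on most, which is how this sketch mirrors the feature matrices the paper extracts from trained networks.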