Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks

Cited: 0
Authors
Nguyen, Quynh [1 ]
Mondelli, Marco [2 ]
Montufar, Guido [1 ,3 ]
Affiliations
[1] MPI MIS, Leipzig, Germany
[2] IST Austria, Klosterneuburg, Austria
[3] Univ Calif Los Angeles, Los Angeles, CA 90024 USA
Funding
European Research Council
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A recent line of work has analyzed the theoretical properties of deep neural networks via the Neural Tangent Kernel (NTK). In particular, the smallest eigenvalue of the NTK has been related to the memorization capacity, the global convergence of gradient descent algorithms and the generalization of deep nets. However, existing results either provide bounds in the two-layer setting or assume that the spectrum of the NTK matrices is bounded away from 0 for multi-layer networks. In this paper, we provide tight bounds on the smallest eigenvalue of NTK matrices for deep ReLU nets, both in the limiting case of infinite widths and for finite widths. In the finite-width setting, the network architectures we consider are fairly general: we require the existence of a wide layer with roughly order N neurons, N being the number of data samples; and the scaling of the remaining layer widths is arbitrary (up to logarithmic factors). To obtain our results, we analyze various quantities of independent interest: we give lower bounds on the smallest singular value of hidden feature matrices, and upper bounds on the Lipschitz constant of input-output feature maps.
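For a concrete handle on the central quantity of the abstract, the sketch below (not taken from the paper; the widths, data, and helper names init_params, f, and ntk_matrix are illustrative assumptions) builds a small finite-width deep ReLU network in JAX, forms the empirical NTK Gram matrix K_ij = <grad_theta f(x_i), grad_theta f(x_j)>, and reports its smallest eigenvalue lambda_min(K).

import jax
import jax.numpy as jnp

def init_params(key, widths):
    # widths = [d_in, n_1, ..., n_L, 1]; He-style Gaussian initialization (toy choice)
    params = []
    for i in range(len(widths) - 1):
        key, sub = jax.random.split(key)
        W = jax.random.normal(sub, (widths[i + 1], widths[i])) * jnp.sqrt(2.0 / widths[i])
        params.append(W)
    return params

def f(params, x):
    # scalar-output deep ReLU network
    h = x
    for W in params[:-1]:
        h = jax.nn.relu(W @ h)
    return (params[-1] @ h)[0]

def ntk_matrix(params, X):
    # per-sample gradients of the output w.r.t. all parameters, flattened into rows of J
    grads = jax.vmap(lambda x: jax.grad(f)(params, x))(X)
    J = jnp.concatenate([g.reshape(X.shape[0], -1) for g in grads], axis=1)
    return J @ J.T  # N x N empirical NTK Gram matrix

key_x, key_w = jax.random.split(jax.random.PRNGKey(0))
X = jax.random.normal(key_x, (20, 10))                 # N = 20 samples in dimension 10 (toy data)
X = X / jnp.linalg.norm(X, axis=1, keepdims=True)      # unit-norm inputs
params = init_params(key_w, [10, 64, 64, 1])           # hidden widths exceed N, echoing the "wide layer" setting
K = ntk_matrix(params, X)
print("lambda_min(NTK) =", jnp.linalg.eigvalsh(K)[0])  # eigenvalues in ascending order

The paper's results concern lower and upper bounds on this smallest eigenvalue (and its infinite-width limit); the snippet only illustrates how the finite-width NTK matrix and lambda_min are defined for a toy instance.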
Pages: 11
Related Papers
50 records in total
  • [21] New Error Bounds for Deep ReLU Networks Using Sparse Grids
    Montanelli, Hadrien
    Du, Qiang
    SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2019, 1 (01): : 78 - 92
  • [22] Trajectory growth lower bounds for random sparse deep ReLU networks
    Price, Ilan
    Tanner, Jared
    20TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2021), 2021, : 1004 - 1009
  • [23] Dynamics of Deep Neural Networks and Neural Tangent Hierarchy
    Huang, Jiaoyang
    Yau, Horng-Tzer
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [24] Neural Tangent Kernel: Convergence and Generalization in Neural Networks (Invited Paper)
    Jacot, Arthur
    Gabriel, Franck
    Hongler, Clement
    STOC '21: PROCEEDINGS OF THE 53RD ANNUAL ACM SIGACT SYMPOSIUM ON THEORY OF COMPUTING, 2021, : 6 - 6
  • [25] Robust nonparametric regression based on deep ReLU neural networks
    Chen, Juntong
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2024, 233
  • [26] On Centralization and Unitization of Batch Normalization for Deep ReLU Neural Networks
    Fei, Wen
    Dai, Wenrui
    Li, Chenglin
    Zou, Junni
    Xiong, Hongkai
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2024, 72 : 2827 - 2841
  • [27] Deep ReLU neural networks in high-dimensional approximation
    Dung, Dinh
    Nguyen, Van Kien
    NEURAL NETWORKS, 2021, 142 : 619 - 635
  • [28] ReLU deep neural networks from the hierarchical basis perspective
    He, Juncai
    Li, Lin
    Xu, Jinchao
    COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2022, 120 : 105 - 114
  • [29] The Quantum Path Kernel: A Generalized Neural Tangent Kernel for Deep Quantum Machine Learning
    Incudini, M.
    Grossi, M.
    Mandarino, A.
    Vallecorsa, S.
    Di Pierro, A.
    Windridge, D.
    IEEE TRANSACTIONS ON QUANTUM ENGINEERING, 2023, 4
  • [30] Tight bounds on the size of neural networks for classification problems
    Beiu, V.
    de Pauw, T.
    BIOLOGICAL AND ARTIFICIAL COMPUTATION: FROM NEUROSCIENCE TO TECHNOLOGY, 1997, 1240 : 743 - 752