On the Provable Generalization of Recurrent Neural Networks

Cited by: 0
Authors:
Wang, Lifu [1]
Shen, Bo [1]
Hu, Bo [1]
Cao, Xing [1]
Affiliations:
[1] Beijing Jiaotong Univ, Beijing, Peoples R China
Keywords: none listed
DOI: not available
Chinese Library Classification (CLC): TP18 [Artificial Intelligence Theory];
Discipline classification codes: 081104; 0812; 0835; 1405;
Abstract
Recurrent Neural Networks (RNNs) are a fundamental structure in deep learning. Recently, several works have studied the training process of over-parameterized neural networks and shown that such networks can learn functions in some notable concept classes with a provable generalization error bound. In this paper, we analyze the training and generalization of RNNs with random initialization, and provide the following improvements over recent works: (1) For an RNN with input sequence $x = (x_1, x_2, \ldots, x_L)$, previous works study learning functions that are summations of $f(\beta_l^T x_l)$ and require the normalization condition $\|x_l\| \le \epsilon$ for some $\epsilon$ that is very small, depending on the complexity of $f$. In this paper, using a detailed analysis of the neural tangent kernel matrix, we prove a generalization error bound for learning such functions without the normalization condition, and show that some notable concept classes are learnable with the numbers of iterations and samples scaling almost-polynomially in the input length $L$. (2) Moreover, we prove a novel result for learning $N$-variable functions of the input sequence of the form $f(\beta^T [x_{l_1}, \ldots, x_{l_N}])$, which do not belong to the "additive" concept class, i.e., summations of functions $f(x_l)$ of a single position. We show that when either $N$ or $l_0 = \max(l_1, \ldots, l_N) - \min(l_1, \ldots, l_N)$ is small, $f(\beta^T [x_{l_1}, \ldots, x_{l_N}])$ is learnable with the numbers of iterations and samples scaling almost-polynomially in the input length $L$.
Pages: 12
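
The two concept classes in the abstract can be made concrete with a small sketch. The following is a minimal, hypothetical illustration (not the authors' code): it generates targets of the additive form $\sum_l f(\beta_l^T x_l)$ and of the $N$-variable form $f(\beta^T [x_{l_1}, \ldots, x_{l_N}])$ for $N = 2$, then fits an over-parameterized Elman RNN by gradient descent from random initialization, the regime the NTK analysis studies. The choice $f = \tanh$, the network width, and all hyperparameters below are illustrative assumptions, not values from the paper.

```python
# Hypothetical sketch of the two target classes analyzed in the paper,
# fit with an over-parameterized RNN. All settings are illustrative.
import torch
import torch.nn as nn

torch.manual_seed(0)
L, d, n = 8, 4, 64  # sequence length, input dimension, sample count (assumed)

# Concept class (1): "additive" targets  y = sum_l f(beta_l^T x_l),
# with f = tanh and fixed random directions beta_l (assumptions).
betas = torch.randn(L, d)
def additive_target(x):                      # x: (n, L, d)
    return torch.tanh(torch.einsum('nld,ld->nl', x, betas)).sum(dim=1)

# Concept class (2): N-variable targets  y = f(beta^T [x_{l1}, ..., x_{lN}]),
# shown for N = 2 at arbitrarily chosen positions l1, l2.
l1, l2 = 2, 5
beta = torch.randn(2 * d)
def nvar_target(x):
    return torch.tanh(torch.cat([x[:, l1], x[:, l2]], dim=1) @ beta)

x = torch.randn(n, L, d)
y = additive_target(x)                       # swap in nvar_target(x) for class (2)

# Over-parameterized Elman RNN with random initialization (NTK regime:
# the width is large relative to the number of samples).
width = 512
rnn = nn.RNN(d, width, batch_first=True)
head = nn.Linear(width, 1)
opt = torch.optim.SGD(list(rnn.parameters()) + list(head.parameters()), lr=1e-2)

for step in range(200):
    hidden, _ = rnn(x)                       # hidden: (n, L, width)
    pred = head(hidden[:, -1]).squeeze(-1)   # read out the final hidden state
    loss = ((pred - y) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The paper's point, in these terms, is that gradient descent from such a random initialization provably generalizes on class (1) without a smallness condition on $\|x_l\|$, and on class (2) when either $N$ or the position spread $l_0$ is small, with iteration and sample counts scaling almost-polynomially in $L$.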