On the Provable Generalization of Recurrent Neural Networks

Cited by: 0
Authors:
Wang, Lifu [1]
Shen, Bo [1]
Hu, Bo [1]
Cao, Xing [1]
Affiliations:
[1] Beijing Jiaotong Univ, Beijing, Peoples R China
Keywords: none listed
DOI: not available
Chinese Library Classification (CLC): TP18 [Artificial Intelligence Theory];
Discipline classification codes: 081104; 0812; 0835; 1405;
Abstract
Recurrent Neural Networks (RNNs) are a fundamental structure in deep learning. Recently, several works have studied the training process of over-parameterized neural networks and shown that such networks can learn functions in some notable concept classes with a provable generalization error bound. In this paper, we analyze the training and generalization of RNNs with random initialization, and provide the following improvements over recent works: (1) For an RNN with input sequence $x = (x_1, x_2, \ldots, x_L)$, previous works study learning functions that are summations of $f(\beta_l^T x_l)$ and require the normalization condition $\|x_l\| \le \epsilon$ for some $\epsilon$ that is very small, depending on the complexity of $f$. In this paper, using a detailed analysis of the neural tangent kernel matrix, we prove a generalization error bound for learning such functions without the normalization condition, and show that some notable concept classes are learnable with the numbers of iterations and samples scaling almost-polynomially in the input length $L$. (2) Moreover, we prove a novel result for learning $N$-variable functions of the input sequence of the form $f(\beta^T [x_{l_1}, \ldots, x_{l_N}])$, which do not belong to the "additive" concept class, i.e., summations of functions $f(x_l)$ of a single position. We show that when either $N$ or $l_0 = \max(l_1, \ldots, l_N) - \min(l_1, \ldots, l_N)$ is small, $f(\beta^T [x_{l_1}, \ldots, x_{l_N}])$ is learnable with the numbers of iterations and samples scaling almost-polynomially in the input length $L$.
Pages: 12
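
The two concept classes in the abstract can be made concrete with a small sketch. The following is a minimal, hypothetical illustration (not the authors' code): it generates targets of the additive form $\sum_l f(\beta_l^T x_l)$ and of the $N$-variable form $f(\beta^T [x_{l_1}, \ldots, x_{l_N}])$ for $N = 2$, then fits an over-parameterized Elman RNN by gradient descent from random initialization, the regime the NTK analysis studies. The choice $f = \tanh$, the network width, and all hyperparameters below are illustrative assumptions, not values from the paper.

```python
# Hypothetical sketch of the two target classes analyzed in the paper,
# fit with an over-parameterized RNN. All settings are illustrative.
import torch
import torch.nn as nn

torch.manual_seed(0)
L, d, n = 8, 4, 64  # sequence length, input dimension, sample count (assumed)

# Concept class (1): "additive" targets  y = sum_l f(beta_l^T x_l),
# with f = tanh and fixed random directions beta_l (assumptions).
betas = torch.randn(L, d)
def additive_target(x):                      # x: (n, L, d)
    return torch.tanh(torch.einsum('nld,ld->nl', x, betas)).sum(dim=1)

# Concept class (2): N-variable targets  y = f(beta^T [x_{l1}, ..., x_{lN}]),
# shown for N = 2 at arbitrarily chosen positions l1, l2.
l1, l2 = 2, 5
beta = torch.randn(2 * d)
def nvar_target(x):
    return torch.tanh(torch.cat([x[:, l1], x[:, l2]], dim=1) @ beta)

x = torch.randn(n, L, d)
y = additive_target(x)                       # swap in nvar_target(x) for class (2)

# Over-parameterized Elman RNN with random initialization (NTK regime:
# the width is large relative to the number of samples).
width = 512
rnn = nn.RNN(d, width, batch_first=True)
head = nn.Linear(width, 1)
opt = torch.optim.SGD(list(rnn.parameters()) + list(head.parameters()), lr=1e-2)

for step in range(200):
    hidden, _ = rnn(x)                       # hidden: (n, L, width)
    pred = head(hidden[:, -1]).squeeze(-1)   # read out the final hidden state
    loss = ((pred - y) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The paper's point, in these terms, is that gradient descent from such a random initialization provably generalizes on class (1) without a smallness condition on $\|x_l\|$, and on class (2) when either $N$ or the position spread $l_0$ is small, with iteration and sample counts scaling almost-polynomially in $L$.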