Improved rates and asymptotic normality for nonparametric neural network estimators

Cited by: 88
Authors
Chen, XH [1]
White, H
Affiliations
[1] Univ Chicago, Dept Econ, Chicago, IL 60637 USA
[2] Univ Calif San Diego, Dept Econ, La Jolla, CA 92093 USA
[3] Univ Calif San Diego, Inst Neural Computat, La Jolla, CA 92093 USA
Funding
National Science Foundation (USA);
Keywords
artificial neural networks; asymptotic normality; degree of approximation; mixing processes; nonparametric estimation; semiparametric estimation; sieve estimation; statistical inference;
DOI
10.1109/18.749011
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
In this paper, we obtain an improved approximation rate (in Sobolev norm) of $r^{-1/2-\alpha/(d+1)}$ for a large class of single hidden layer feedforward artificial neural networks (ANN) with $r$ hidden units and possibly nonsigmoid activation functions when the target function satisfies certain smoothness conditions. Here, $d$ is the dimension of the domain of the target function, and $\alpha \in (0, 1]$ is related to the smoothness of the activation function. When applying this class of ANNs to nonparametrically estimate (train) a general target function using the method of sieves, we obtain new root-mean-square convergence rates of $O_P\left([n/\log(n)]^{-(1+2\alpha/(d+1))/[4(1+\alpha/(d+1))]}\right) = o_P(n^{-1/4})$ by letting the number of hidden units $r$ increase appropriately with the sample size (number of training examples) $n$. These rates are valid for i.i.d. data as well as for uniform mixing and absolutely regular ($\beta$-mixing) stationary time series data. In addition, the rates are fast enough to deliver root-$n$ asymptotic normality for plug-in estimates of smooth functionals using general ANN sieve estimators. As interesting applications to nonlinear time series, we establish rates for ANN sieve estimators of four different multivariate target functions: a conditional mean, a conditional quantile, a joint density, and a conditional density. We also obtain root-$n$ asymptotic normality results for semiparametric model coefficient and average derivative estimators.
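The arithmetic behind the two rates quoted in the abstract can be checked with a short script. This is only an illustrative sketch, not code from the paper: the function names `approximation_exponent` and `estimation_exponent` are hypothetical, and it assumes the estimation exponent reads $(1+2\alpha/(d+1))/[4(1+\alpha/(d+1))]$, which is the reading under which the rate is faster than $n^{-1/4}$. The $\log(n)$ factor is ignored; only the polynomial exponents are compared.

```python
from fractions import Fraction


def _check(d, alpha):
    # The abstract states d is the input dimension and alpha lies in (0, 1].
    if d < 1:
        raise ValueError("d must be a positive integer")
    if not (0 < alpha <= 1):
        raise ValueError("alpha must lie in (0, 1]")


def approximation_exponent(d, alpha):
    """Exponent e such that the approximation error is of order r**(-e)."""
    alpha = Fraction(alpha)
    _check(d, alpha)
    return Fraction(1, 2) + alpha / (d + 1)


def estimation_exponent(d, alpha):
    """Exponent e such that the estimation error is of order (n/log n)**(-e).

    Assumed form: (1 + 2a) / (4 * (1 + a)) with a = alpha / (d + 1).
    """
    alpha = Fraction(alpha)
    _check(d, alpha)
    a = alpha / (d + 1)
    return (1 + 2 * a) / (4 * (1 + a))


# Print the exponents for a few dimensions with alpha = 1 (illustrative values).
for d in (1, 2, 5, 10, 100):
    e_approx = approximation_exponent(d, 1)
    e_est = estimation_exponent(d, 1)
    print(f"d={d:>3}  approx exponent={e_approx}  "
          f"estimation exponent={e_est}  exceeds 1/4: {e_est > Fraction(1, 4)}")
```

Because $(1+2a)/(1+a) > 1$ for every $a > 0$, the estimation exponent always exceeds $1/4$, although it approaches $1/4$ as $d$ grows. Exact rational arithmetic (`Fraction`) is used so the comparison with $1/4$ is not affected by floating-point rounding.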
Pages: 682-691
Page count: 10
Related papers
23 in total
[1]  
Doukhan P., 2012, Mixing: Properties and Examples, V85
[2]   UNIVERSAL APPROXIMATION BOUNDS FOR SUPERPOSITIONS OF A SIGMOIDAL FUNCTION [J].
BARRON, AR .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1993, 39 (03) :930-945
[3]   APPROXIMATION AND ESTIMATION BOUNDS FOR ARTIFICIAL NEURAL NETWORKS [J].
BARRON, AR .
MACHINE LEARNING, 1994, 14 (01) :115-133
[4]   APPROXIMATION OF DENSITY-FUNCTIONS BY SEQUENCES OF EXPONENTIAL-FAMILIES [J].
BARRON, AR ;
SHEU, CH .
ANNALS OF STATISTICS, 1991, 19 (03) :1347-1369
[5]   Sieve extremum estimates for weakly dependent data [J].
Chen, XH ;
Shen, XT .
ECONOMETRICA, 1998, 66 (02) :289-314
[6]   LIMIT-THEOREMS FOR SUMS OF WEAKLY DEPENDENT BANACH-SPACE VALUED RANDOM-VARIABLES [J].
DEHLING, H .
ZEITSCHRIFT FUR WAHRSCHEINLICHKEITSTHEORIE UND VERWANDTE GEBIETE, 1983, 63 (03) :393-432
[7]  
GABUSHIN VN, 1967, MAT ZAMETKI, V1, P291
[8]   OPTIMAL PLUG-IN ESTIMATORS FOR NONPARAMETRIC FUNCTIONAL ESTIMATION [J].
GOLDSTEIN, L ;
MESSER, K .
ANNALS OF STATISTICS, 1992, 20 (03) :1306-1328
[9]  
Grenander U., 1981, ABSTRACT INFERENCE
[10]   DEGREE OF APPROXIMATION RESULTS FOR FEEDFORWARD NETWORKS APPROXIMATING UNKNOWN MAPPINGS AND THEIR DERIVATIVES [J].
HORNIK, K ;
STINCHCOMBE, M ;
WHITE, H ;
AUER, P .
NEURAL COMPUTATION, 1994, 6 (06) :1262-1275