Weighted Deep Stochastic Configuration Networks Based on M-estimator Functions

Cited by: 0
Authors
Ding S.-F. [1 ,2 ]
Zhang C.-L. [1 ]
Guo L.-L. [1 ,2 ]
Zhang J. [1 ,2 ]
Ding L. [3 ]
Affiliations
[1] School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu
[2] Mine Digitization Engineering Research Center of Ministry of Education, China University of Mining and Technology, Xuzhou, Jiangsu
[3] College of Intelligence and Computing, Tianjin University, Tianjin
Source
Jisuanji Xuebao/Chinese Journal of Computers | 2023, Vol. 46, No. 11
Funding
National Natural Science Foundation of China
关键词
deep stochastic configuration network; noisy data; random neural network; regression; robustness
DOI
10.11897/SP.J.1016.2023.02476
Abstract
Deep stochastic configuration network (DSCN) is a randomized incremental learning model: it starts from a small structure and gradually adds hidden nodes and layers. The input weights and biases of new nodes are assigned under a supervisory mechanism, all hidden nodes are fully connected to the outputs, and the output weights of DSCN are determined by the least squares method. DSCN therefore requires little manual intervention and offers high learning efficiency and strong generalization ability. However, although the randomized feedforward learning process of DSCN is efficient, its feature learning ability remains insufficient, and as nodes and hidden layers accumulate the model easily overfits. When solving regression problems with noise, the performance of the original DSCN is easily degraded by outliers, which reduces the generalization ability of the model. To improve the regression performance and robustness of DSCN, weighted deep stochastic configuration networks (WDSCN) based on M-estimator functions are proposed. First, two common M-estimator functions (Huber and Bisquare) are adopted to compute sample weights that reduce the negative impact of outliers: a sample with a small training error receives a large weight, whereas a sample with a large training error is treated as an outlier and receives a small weight. Because the sample weight decreases monotonically as the absolute error grows, the influence of noisy data on the model is reduced and the generalization of the algorithm improves. Meanwhile, the weighted least squares method combined with an L2 regularization strategy replaces plain least squares for computing the output weight vector, which both handles noisy regression data and avoids the over-fitting problem of DSCN. Second, since models based on L1 regularization help extract sparse features and improve the accuracy of supervised learning, a stochastic configuration sparse autoencoder (SC-SAE) is designed to further improve the representation ability of WDSCN. SC-SAE assigns its input parameters using the supervision mechanism of DSCN, adds an L1 regularization term to the objective function to obtain sparse features, and solves the objective with the alternating direction method of multipliers (ADMM) to determine its output weights. Because the encoding process of SC-SAE is random, different SC-SAE models yield diverse features, so an effective feature representation for training WDSCN is obtained by fusing the features of multiple SC-SAEs. Finally, experimental results on real-world datasets show that the proposed WDSCN-Huber and WDSCN-Bisquare achieve higher generalization performance and regression accuracy than DSCN, SCN, and other weighted models (e.g., RSC-KDE, RSC-Huber, RSC-IQR, RDSCN-KDE, WBLS-KDE, and RBLS-Huber). Moreover, ablation experiments show that WDSCN trained on sparse features fused from multiple different SC-SAE models is superior to the variants without fused sparse features, verifying that SC-SAE extracts effective sparse features and improves the learning ability of the weighted models. © 2023 Science Press. All rights reserved.
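The abstract names Huber and Bisquare as the two M-estimator functions but does not reproduce their formulas. For reference, the standard weight functions (the form w(e) = ψ(e)/e used in iteratively reweighted schemes) are shown below; the tuning constants k and c are the commonly cited defaults and assume residuals standardized by a robust scale estimate, so the paper's exact scaling may differ:

```latex
w_{\text{Huber}}(e) =
\begin{cases}
1, & |e| \le k, \\
k/|e|, & |e| > k,
\end{cases}
\qquad
w_{\text{Bisquare}}(e) =
\begin{cases}
\bigl(1 - (e/c)^2\bigr)^2, & |e| \le c, \\
0, & |e| > c,
\end{cases}
```

with k = 1.345 and c = 4.685 typical. Both weights decrease monotonically in |e|, matching the behavior described in the abstract; Bisquare additionally assigns exactly zero weight to gross outliers.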
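A minimal numerical sketch of the output-weight computation described above: sample weights from the two M-estimator functions feed a weighted least squares solve with L2 regularization. All names (huber_weights, weighted_ridge, lam, etc.) are illustrative rather than taken from the paper; H stands for the hidden-layer output matrix and T for the training targets.

```python
import numpy as np

def huber_weights(r, k=1.345):
    """Huber weights: 1 inside the threshold, k/|r| outside (assumes standardized residuals)."""
    a = np.abs(r)
    w = np.ones_like(a)
    w[a > k] = k / a[a > k]
    return w

def bisquare_weights(r, c=4.685):
    """Tukey bisquare weights: smooth and redescending, exactly 0 beyond c."""
    u = r / c
    w = (1.0 - u ** 2) ** 2
    w[np.abs(u) > 1.0] = 0.0
    return w

def weighted_ridge(H, T, w, lam=1e-3):
    """L2-regularized weighted least squares:
    beta = (H^T diag(w) H + lam*I)^{-1} H^T diag(w) T
    H: (n_samples, n_hidden), T: (n_samples, n_outputs), w: (n_samples,)
    """
    WH = H * w[:, None]                      # diag(w) @ H without forming diag(w)
    A = H.T @ WH + lam * np.eye(H.shape[1])
    return np.linalg.solve(A, WH.T @ T)      # WH.T @ T == H^T diag(w) T
```

In a robust scheme like the one described, the residuals T - H @ beta would be recomputed and the weights refreshed as nodes are added, so weighting and fitting alternate rather than run once.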
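The SC-SAE output weights are described as the solution of an L1-regularized objective obtained with ADMM. Below is a sketch of the standard ADMM iteration for a lasso-type problem of the form min_W 0.5||HW - X||² + λ||W||₁; the splitting, penalty parameter rho, and iteration count are generic ADMM choices, not details taken from the paper:

```python
import numpy as np

def soft_threshold(V, tau):
    """Proximal operator of the L1 norm (element-wise soft-thresholding)."""
    return np.sign(V) * np.maximum(np.abs(V) - tau, 0.0)

def admm_lasso(H, X, lam=0.01, rho=1.0, n_iter=200):
    """Solve min_W 0.5*||H @ W - X||_F^2 + lam*||W||_1 by ADMM (W = Z splitting).

    H: hidden-layer outputs (n_samples, n_hidden)
    X: autoencoder reconstruction target (n_samples, n_features)
    Returns a sparse weight matrix of shape (n_hidden, n_features).
    """
    n_hidden = H.shape[1]
    Z = np.zeros((n_hidden, X.shape[1]))
    U = np.zeros_like(Z)                     # scaled dual variable
    A = H.T @ H + rho * np.eye(n_hidden)     # constant across iterations
    HtX = H.T @ X
    for _ in range(n_iter):
        W = np.linalg.solve(A, HtX + rho * (Z - U))  # quadratic W-update
        Z = soft_threshold(W + U, lam / rho)         # sparsity-inducing Z-update
        U += W - Z                                   # dual update
    return Z                                 # Z carries the exact zero pattern
```

The fusion step the abstract describes would then amount to column-wise concatenation of the feature matrices produced by the individually trained SC-SAEs before training WDSCN.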
Pages: 2476-2487
Page count: 11