Multisource Latent Feature Selective Ensemble Modeling Approach for Small-Sample High-Dimensional Process Data in Applications

Cited by: 2
Authors
Tang, Jian [1 ,2 ]
Zhang, Jian [3 ]
Yu, Gang [4 ]
Zhang, Wenping [5 ]
Yu, Wen [6 ]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
[2] Beijing Key Lab Computat Intelligence & Intellige, Beijing 100124, Peoples R China
[3] Nanjing Univ Informat Sci & Technol, Sch Comp & Software, Nanjing 210044, Peoples R China
[4] State Beijing Key Lab Proc Automat Min & Met, Beijing 102600, Peoples R China
[5] Shandong Gold Min Technol Co Ltd, Met Lab Branch, Jinan 250014, Peoples R China
[6] CINVESTAV IPN Natl Polytech Inst, Dept Control Automat, Mexico City 07360, DF, Mexico
Funding
US National Science Foundation; Beijing Natural Science Foundation;
Keywords
Feature extraction; Data models; Adaptation models; Pollution measurement; Training; Data mining; Analytical models; Multisource feature extraction; multi-layered feature selection; selective ensemble modeling; hyperparameter selection; high dimensional process data; EXTREME LEARNING-MACHINE; OPTIMIZATION; PARAMETERS; VIBRATION; SIZE;
DOI
10.1109/ACCESS.2020.3015875
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Several difficult-to-measure product quality or environmental pollution indices of industrial processes must be measured with offline laboratory instruments. Soft measurement methods are often used to predict such parameters online. Because of the limitations of the measurement devices and the complex characteristics of the processes, only small-sample modeling data with high-dimensional input features can be obtained. This study therefore proposes a new multisource latent feature selective ensemble (SEN) modeling approach. First, the input features are divided into subgroups according to the characteristics of the modeling data. Second, multisource latent features are extracted for each subgroup through multi-layered selection algorithms that apply, in order, the feature reduction ratio, the feature contribution ratio, and the mutual information value. Third, candidate submodels are constructed from the reduced features using an adaptive hyperparameter selection algorithm based on a multi-step grid search. Finally, the optimized ensemble submodels and their weighting strategies are adaptively determined to build the final SEN model. The proposed method is verified on benchmark near-infrared data, high-dimensional mechanical frequency spectrum data, and industrial dioxin emission concentration data.
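The multi-step grid search named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the submodel type, the hyperparameters, or the search schedule. The sketch assumes a ridge regression submodel, a single hyperparameter (the penalty `lam`), leave-one-out error as the selection criterion, and a coarse grid that is repeatedly narrowed around its best point. The names `loo_rmse` and `multi_step_grid_search` are hypothetical.

```python
import numpy as np


def loo_rmse(X, y, lam):
    """Leave-one-out RMSE of ridge regression with penalty lam (assumed submodel)."""
    n, p = X.shape
    residuals = []
    for i in range(n):
        mask = np.arange(n) != i
        X_tr, y_tr = X[mask], y[mask]
        # Closed-form ridge solution: (X^T X + lam * I)^-1 X^T y
        w = np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(p), X_tr.T @ y_tr)
        residuals.append(y[i] - X[i] @ w)
    return float(np.sqrt(np.mean(np.square(residuals))))


def multi_step_grid_search(X, y, low=-4.0, high=4.0, points=9, steps=3):
    """Search log10(lam) on a coarse grid, then zoom in around the best point."""
    best_exp, best_err = None, np.inf
    for _ in range(steps):
        grid = np.linspace(low, high, points)
        errs = [loo_rmse(X, y, 10.0 ** e) for e in grid]
        idx = int(np.argmin(errs))
        if errs[idx] < best_err:
            best_exp, best_err = float(grid[idx]), errs[idx]
        # The next grid spans one grid spacing on either side of the current best point.
        spacing = (high - low) / (points - 1)
        low, high = grid[idx] - spacing, grid[idx] + spacing
    return 10.0 ** best_exp, best_err
```

Calling `multi_step_grid_search(X, y)` on an array `X` with few rows and many columns returns the selected penalty and its leave-one-out RMSE. Leave-one-out is chosen here only because it suits small samples; any other validation scheme could be substituted.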
Pages: 148475-148488
Number of pages: 14
Related papers
50 in total
  • [1] Online streaming feature selection for high-dimensional small-sample data
    Gong, Kuangfeng
    Li, Guohe
    Guo, Lingyun
    Lin, Yaojin
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, : 2705 - 2719
  • [2] An Efficient Dimensionality Reduction Approach for Small-sample Size and High-dimensional Data Modeling
    Qiu, Xintao
    Fu, Dongmei
    Fu, Zhenduo
    JOURNAL OF COMPUTERS, 2014, 9 (03) : 576 - 580
  • [3] A Hybrid Feature Selection Algorithm Applied to High-dimensional Imbalanced Small-sample Data Classification
    Feng, Fang
    Lv, Qingquan
    Wang, Mingsong
    Yang, Xuhui
    Zhou, Qingguo
    Zhou, Rui
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 41 - 46
  • [4] Analysis of traffic accident causes based on data augmentation and ensemble learning with high-dimensional small-sample data
    Zhu, Leipeng
    Zhang, Zhiqing
    Song, Dongdong
    Chen, Biao
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 237
  • [5] A High-Dimensional and Small-Sample Submersible Fault Detection Method Based on Feature Selection and Data Augmentation
    Zhao, Penghui
    Zheng, Qinghe
    Ding, Zhongjun
    Zhang, Yi
    Wang, Hongjun
    Yang, Yang
    SENSORS, 2022, 22 (01)
  • [6] Mistakes in validating the accuracy of a prediction classifier in high-dimensional but small-sample microarray data
    Lee, Sunho
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2008, 17 (06) : 635 - 642
  • [7] Variational Autoencoder-Based Dimensionality Reduction for High-Dimensional Small-Sample Data Classification
    Mahmud, Mohammad Sultan
    Huang, Joshua Zhexue
    Fu, Xianghua
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2020, 19 (01)
  • [8] A hybrid feature selection approach based on ensemble method for high-dimensional data
    Rouhi, Amirreza
    Nezamabadi-pour, Hossein
    2017 2ND CONFERENCE ON SWARM INTELLIGENCE AND EVOLUTIONARY COMPUTATION (CSIEC), 2017, : 16 - 20
  • [9] High-Dimensional, Small-Sample Product Quality Prediction Method Based on MIC-Stacking Ensemble Learning
    Yu, Jiahao
    Pan, Rongshun
    Zhao, Yongman
    APPLIED SCIENCES-BASEL, 2022, 12 (01)
  • [10] Latent Feature Group Learning for High-Dimensional Data Clustering
    Wang, Wenting
    He, Yulin
    Ma, Liheng
    Huang, Joshua Zhexue
    INFORMATION, 2019, 10 (06)