Multisource Latent Feature Selective Ensemble Modeling Approach for Small-Sample High-Dimensional Process Data in Applications

Cited by: 2
Authors
Tang, Jian [1 ,2 ]
Zhang, Jian [3 ]
Yu, Gang [4 ]
Zhang, Wenping [5 ]
Yu, Wen [6 ]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
[2] Beijing Key Lab Computat Intelligence & Intellige, Beijing 100124, Peoples R China
[3] Nanjing Univ Informat Sci & Technol, Sch Comp & Software, Nanjing 210044, Peoples R China
[4] State Beijing Key Lab Proc Automat Min & Met, Beijing 102600, Peoples R China
[5] Shandong Gold Min Technol Co Ltd, Met Lab Branch, Jinan 250014, Peoples R China
[6] CINVESTAV IPN Natl Polytech Inst, Dept Control Automat, Mexico City 07360, DF, Mexico
Funding
National Science Foundation (USA); Beijing Natural Science Foundation
Keywords
Feature extraction; Data models; Adaptation models; Pollution measurement; Training; Data mining; Analytical models; Multisource feature extraction; multi-layered feature selection; selective ensemble modeling; hyperparameter selection; high dimensional process data; EXTREME LEARNING-MACHINE; OPTIMIZATION; PARAMETERS; VIBRATION; SIZE;
DOI
10.1109/ACCESS.2020.3015875
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Several difficult-to-measure production quality or environmental pollution indices of industrial processes must be measured with offline laboratory instruments. Soft measurement methods are often used to predict such parameters online. Because of the limitations of the measurement devices and the complex characteristics of the process, only small-sample modeling data with high-dimensional input features can be obtained. Therefore, this study proposes a new multisource latent feature selective ensemble (SEN) modeling approach. First, the input features are divided into subgroups according to the characteristics of the modeling data. Second, multisource latent features are obtained for each subgroup through multi-layered selection algorithms governed, in order, by the feature reduction ratio, the feature contribution ratio, and the mutual information value. Third, to construct candidate submodels from the reduced features, an adaptive hyperparameter selection algorithm based on multi-step grid search is employed. Finally, the optimized ensemble submodels and their weighting strategy are adaptively determined to build the final SEN model. The proposed method is verified on benchmark near-infrared data, high-dimensional mechanical frequency spectrum data, and industrial dioxin emission concentration data.
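
The pipeline outlined in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: PCA stands in for latent feature extraction, a histogram estimate stands in for the mutual information computation, ridge regression with a single-stage grid stands in for the candidate submodels and the multi-step grid search, and inverse-error weighting stands in for the paper's weighting strategies. All function names (`pca_latent`, `hist_mi`, `build_sen`) and parameters (`n_groups`, `keep`, `lams`, `train_frac`) are hypothetical, and the contiguous split of features into subgroups is an assumption.

```python
import numpy as np

def pca_latent(X, contribution=0.95):
    """Scores on the leading principal components whose cumulative
    variance contribution first reaches `contribution` (assumed criterion)."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    ratio = np.cumsum(s**2) / np.sum(s**2)
    k = min(int(np.searchsorted(ratio, contribution)) + 1, len(s))
    return U[:, :k] * s[:k]

def hist_mi(x, y, bins=8):
    """Histogram estimate of the mutual information (nats) between x and y."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def ridge_fit(Z, y, lam):
    """Closed-form ridge regression with an appended intercept column."""
    A = np.hstack([Z, np.ones((len(Z), 1))])
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)

def ridge_predict(Z, w):
    return np.hstack([Z, np.ones((len(Z), 1))]) @ w

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

def build_sen(X, y, n_groups=3, keep=2, lams=(1e-3, 1e-1, 1e1), train_frac=0.7):
    n_tr = int(train_frac * len(X))
    tr, va = slice(0, n_tr), slice(n_tr, None)
    preds, errs = [], []
    # Step 1: split the input features into subgroups (contiguous blocks here).
    for block in np.array_split(np.arange(X.shape[1]), n_groups):
        # Step 2: extract latent features, then keep those with the highest
        # mutual information with the target (estimated on training rows).
        Z = pca_latent(X[:, block])
        mi = [hist_mi(Z[tr, j], y[tr]) for j in range(Z.shape[1])]
        Z = Z[:, np.argsort(mi)[::-1][:keep]]
        # Step 3: grid search over the ridge penalty for this candidate submodel.
        cands = [ridge_predict(Z[va], ridge_fit(Z[tr], y[tr], lam)) for lam in lams]
        best = min(cands, key=lambda p: rmse(p, y[va]))
        preds.append(best)
        errs.append(rmse(best, y[va]))
    preds, errs = np.array(preds), np.array(errs)
    # Step 4: selective ensemble. Try the top-m submodels for every m and keep
    # the subset whose inverse-error weighted average has the lowest RMSE.
    order, result = np.argsort(errs), None
    for m in range(1, len(order) + 1):
        idx = order[:m]
        w = 1.0 / (errs[idx] + 1e-12)
        w = w / w.sum()
        e = rmse(w @ preds[idx], y[va])
        if result is None or e < result["rmse"]:
            result = {"selected": idx.tolist(), "weights": w,
                      "rmse": e, "submodel_rmse": errs}
    return result
```

Calling `build_sen(X, y)` on an array `X` of shape `(n_samples, n_features)` returns the indices of the selected submodels, their weights, and the validation RMSE. For brevity the sketch reuses one validation split for both hyperparameter choice and ensemble selection, which a careful small-sample study would replace with cross-validation.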
Pages: 148475 - 148488
Number of pages: 14
Related papers
50 in total
  • [21] A Light Causal Feature Selection Approach to High-Dimensional Data
    Ling, Zhaolong
    Li, Ying
    Zhang, Yiwen
    Yu, Kui
    Zhou, Peng
    Li, Bo
    Wu, Xindong
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (08) : 7639 - 7650
  • [22] Multistage feature selection approach for high-dimensional cancer data
    Alkuhlani, Alhasan
    Nassef, Mohammad
    Farag, Ibrahim
    SOFT COMPUTING, 2017, 21 (22) : 6895 - 6906
  • [23] On generalized latent factor modeling and inference for high-dimensional binomial data
    Ma, Ting Fung
    Wang, Fangfang
    Zhu, Jun
    BIOMETRICS, 2023, 79 (03) : 2311 - 2320
  • [24] Multistage feature selection approach for high-dimensional cancer data
    Alhasan Alkuhlani
    Mohammad Nassef
    Ibrahim Farag
    Soft Computing, 2017, 21 : 6895 - 6906
  • [25] PLS-based recursive feature elimination for high-dimensional small sample
    You, Wenjie
    Yang, Zijiang
    Ji, Guoli
    KNOWLEDGE-BASED SYSTEMS, 2014, 55 : 15 - 28
  • [26] Small sample sizes: A big data problem in high-dimensional data analysis
    Konietschke, Frank
    Schwab, Karima
    Pauly, Markus
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2021, 30 (03) : 687 - 701
  • [27] An integrated manifold learning approach for high-dimensional data feature extractions and its applications to online process monitoring of additive manufacturing
    Liu, Chenang
    Kong, Zhenyu
    Babu, Suresh
    Joslin, Chase
    Ferguson, James
    IISE TRANSACTIONS, 2021, 53 (11) : 1215 - 1230
  • [28] Ensemble decision forest of RBF networks via hybrid feature clustering approach for high-dimensional data classification
    Abpeykar, Shadi
    Ghatee, Mehdi
    Zare, Hadi
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2019, 131 : 12 - 36
  • [29] Feature Selection in High-Dimensional Space with Applications to Gene Expression Data
    Pantha, Nishan
    Ramasubramanian, Muthukumaran
    Gurung, Iksha
    Maskey, Manil
    Sanders, Lauren M.
    Casaletto, James
    Costes, Sylvain V.
    SOUTHEASTCON 2024, 2024, : 6 - 15
  • [30] Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach
    Biglari, M.
    Mirzaei, F.
    Hassanpour, H.
    INTERNATIONAL JOURNAL OF ENGINEERING, 2020, 33 (02): : 213 - 220