Multisource Latent Feature Selective Ensemble Modeling Approach for Small-Sample High-Dimensional Process Data in Applications

Cited by: 2
Authors
Tang, Jian [1 ,2 ]
Zhang, Jian [3 ]
Yu, Gang [4 ]
Zhang, Wenping [5 ]
Yu, Wen [6 ]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
[2] Beijing Key Lab Computat Intelligence & Intellige, Beijing 100124, Peoples R China
[3] Nanjing Univ Informat Sci & Technol, Sch Comp & Software, Nanjing 210044, Peoples R China
[4] State Beijing Key Lab Proc Automat Min & Met, Beijing 102600, Peoples R China
[5] Shandong Gold Min Technol Co Ltd, Met Lab Branch, Jinan 250014, Peoples R China
[6] CINVESTAV IPN Natl Polytech Inst, Dept Control Automat, Mexico City 07360, DF, Mexico
Funding
National Science Foundation (USA); Beijing Natural Science Foundation
Keywords
Feature extraction; Data models; Adaptation models; Pollution measurement; Training; Data mining; Analytical models; Multisource feature extraction; multi-layered feature selection; selective ensemble modeling; hyperparameter selection; high dimensional process data; EXTREME LEARNING-MACHINE; OPTIMIZATION; PARAMETERS; VIBRATION; SIZE;
DOI
10.1109/ACCESS.2020.3015875
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Several difficult-to-measure production quality or environmental pollution indices of industrial processes must be measured using offline laboratory instruments. Soft measurement methods are often used to predict such parameters online. Because of the limitations of the measurement devices and the complex characteristics of the processes, only small-sample modeling data with high-dimensional input features can be obtained. Therefore, a new multisource latent feature selective ensemble (SEN) modeling approach is proposed in this study. First, the input features are divided into subgroups according to the characteristics of the modeling data. Second, multisource latent features are extracted for each subgroup and passed through multi-layered selection algorithms, which apply, in order, a feature reduction ratio, a feature contribution ratio, and a mutual information value. Third, candidate sub-models are constructed on the reduced features using an adaptive hyperparameter selection algorithm based on multi-step grid search. Finally, the optimized ensemble sub-models and their weighting strategies are adaptively determined to build the final SEN model. The proposed method is verified using benchmark near-infrared data, high-dimensional mechanical frequency spectrum data, and industrial dioxin emission concentration data.
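The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the latent feature extractor, the sub-model type, or the ensemble weighting rule, so the sketch assumes PCA (via SVD) for latent features, a histogram estimate of mutual information, ridge regression for the sub-models, and inverse-error weighting of the best `keep` sub-models. The feature reduction ratio layer is omitted. All function and parameter names are hypothetical.

```python
import numpy as np

def fit_latent(X, contribution=0.95):
    """Assumed extractor: PCA via SVD, keeping components up to a cumulative variance ratio."""
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    ratio = np.cumsum(s**2) / np.sum(s**2)
    k = min(int(np.searchsorted(ratio, contribution)) + 1, len(s))
    return mean, Vt[:k]

def mutual_info(x, y, bins=8):
    """Histogram estimate of mutual information (nats) between two 1-D arrays."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def ridge_fit(Z, y, lam):
    """Assumed sub-model: ridge regression with a bias column."""
    A = np.hstack([Z, np.ones((len(Z), 1))])
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)

def ridge_predict(Z, w):
    return np.hstack([Z, np.ones((len(Z), 1))]) @ w

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

def multi_step_grid_search(Ztr, ytr, Zva, yva, steps=3):
    """Coarse-to-fine search over log10(lambda): each step narrows the grid around the best value."""
    lo, hi = -4.0, 4.0
    best = None
    for _ in range(steps):
        for e in np.linspace(lo, hi, 9):
            w = ridge_fit(Ztr, ytr, 10.0 ** e)
            err = rmse(ridge_predict(Zva, w), yva)
            if best is None or err < best[0]:
                best = (err, e, w)
        spacing = (hi - lo) / 8          # spacing of the 9-point grid just searched
        lo, hi = best[1] - spacing, best[1] + spacing
    return best[0], best[2]

def selective_ensemble(X, y, groups, n_tr, top_mi=3, keep=2):
    """One candidate sub-model per feature subgroup; keep the best `keep` and weight them."""
    candidates = []
    for cols in groups:
        Xtr, Xva = X[:n_tr][:, cols], X[n_tr:][:, cols]
        mean, V = fit_latent(Xtr)
        Ztr, Zva = (Xtr - mean) @ V.T, (Xva - mean) @ V.T
        # Rank latent features by mutual information with the target, keep the top ones.
        mi = [mutual_info(Ztr[:, j], y[:n_tr]) for j in range(Ztr.shape[1])]
        idx = np.argsort(mi)[::-1][:top_mi]
        err, w = multi_step_grid_search(Ztr[:, idx], y[:n_tr], Zva[:, idx], y[n_tr:])
        candidates.append((err, ridge_predict(Zva[:, idx], w)))
    candidates.sort(key=lambda c: c[0])
    chosen = candidates[:keep]
    # Assumed weighting rule: inverse validation error, normalized to sum to one.
    weights = np.array([1.0 / (c[0] + 1e-12) for c in chosen])
    weights /= weights.sum()
    pred = sum(wt * c[1] for wt, c in zip(weights, chosen))
    return pred, weights

# Synthetic small-sample demo: 60 samples, 30 features in three subgroups.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 30))
y = X[:, :10] @ rng.normal(size=10) + 0.1 * rng.normal(size=60)
groups = [np.arange(0, 10), np.arange(10, 20), np.arange(20, 30)]
pred, weights = selective_ensemble(X, y, groups, n_tr=40)
```

For brevity the same hold-out split serves both hyperparameter selection and sub-model selection here; a careful evaluation would use a separate test set.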
Pages: 148475-148488
Page count: 14