Research on the Application of Random Forest-based Feature Selection Algorithm in Data Mining Experiments

Times cited: 0
Authors
Wang, Huan [1]
Affiliations
[1] Southwest Forestry Univ, Coll Big Data & Intelligence Engn, Kunming 650224, Yunnan, Peoples R China
Keywords
Random forest; SVM; machine learning; big data; feature selection; best-first search; rough set theory
DOI
10.14569/IJACSA.2023.0141054
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
High-dimensional big data presents substantial challenges for Machine Learning (ML) algorithms, mainly because the curse of dimensionality leads to computational inefficiency and an increased risk of overfitting. Various dimensionality reduction and Feature Selection (FS) techniques have been developed to alleviate these challenges. Random Forest (RF), a widely used Ensemble Learning Method (ELM), is recognized for its high accuracy and robustness, and it also has a lesser-known capability for effective FS. Although specialized RF models have been designed for FS, they often struggle with computational efficiency on large datasets. To address these challenges, this study proposes a novel Feature Selection Model (FSM) integrated with data reduction techniques, termed Dynamic Correlated Regularized Random Forest (DCRRF). The architecture operates in four phases: preprocessing, Feature Reduction (FR) using Best-First Search with Rough Set Theory (BFS-RST), FS through DCRRF, and feature efficacy assessment using a Support Vector Machine (SVM) classifier. Benchmarked on four gene expression datasets, the proposed model outperforms existing RF-based methods in both computational efficiency and classification accuracy. The study thus offers a robust and efficient approach to feature selection in high-dimensional big-data scenarios.
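The four-phase structure described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's method: the abstract does not specify BFS-RST or DCRRF in enough detail to reproduce them, so the sketch substitutes a univariate F-test filter for the feature reduction phase and a plain scikit-learn random forest importance ranking for the selection phase. The function name `select_and_evaluate`, its parameters, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def select_and_evaluate(X, y, n_filtered=200, n_selected=20, seed=0):
    """Hypothetical four-phase pipeline: scale, filter, rank with RF, score with SVM."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )

    # Phase 1: preprocessing (standardize using training statistics only).
    scaler = StandardScaler().fit(X_tr)
    X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

    # Phase 2: feature reduction.
    # Stand-in for BFS-RST: a univariate F-test filter (an assumption, not the paper's method).
    filt = SelectKBest(f_classif, k=n_filtered).fit(X_tr, y_tr)
    kept = filt.get_support(indices=True)

    # Phase 3: feature selection.
    # Stand-in for DCRRF: impurity-based importances from a plain random forest.
    rf = RandomForestClassifier(n_estimators=200, random_state=seed)
    rf.fit(X_tr[:, kept], y_tr)
    top = np.argsort(rf.feature_importances_)[::-1][:n_selected]
    selected = kept[top]  # indices into the original feature space

    # Phase 4: assess the selected features with an SVM on held-out data.
    svm = SVC(kernel="linear").fit(X_tr[:, selected], y_tr)
    accuracy = svm.score(X_te[:, selected], y_te)
    return selected, accuracy


if __name__ == "__main__":
    # Synthetic high-dimensional data standing in for a gene expression dataset.
    X, y = make_classification(
        n_samples=200, n_features=1000, n_informative=15, random_state=0
    )
    selected, accuracy = select_and_evaluate(X, y)
    print("selected feature indices:", sorted(selected.tolist()))
    print("held-out SVM accuracy:", round(accuracy, 3))
```

Fitting the scaler, filter, and forest on the training split only keeps the held-out SVM score from being inflated by information leaking from the test samples, which matters when features far outnumber samples.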
Pages: 505-518
Page count: 14
Related papers
50 in total
  • [21] Research of Medical High-dimensional Imbalanced Data Classification-Ensemble Feature Selection Algorithm with Random Forest
    Zhu, Min
    Su, Bo
    Ning, Gangmin
    2017 INTERNATIONAL CONFERENCE ON SMART GRID AND ELECTRICAL AUTOMATION (ICSGEA), 2017, : 273 - 277
  • [22] Feature Selection Based on Random Forest and Application in Correlation Analysis of Symptom and Disease
    Hu Xue-qin
    Cui Meng
    Chen Bing
    2009 IEEE INTERNATIONAL SYMPOSIUM ON IT IN MEDICINE & EDUCATION, VOLS 1 AND 2, PROCEEDINGS, 2009, : 120 - +
  • [23] Random Forest-based Algorithm for Sleep Spindle Detection in Infant EEG
    Wei, Lan
    Ventura, Soraia
    Lowery, Madeleine
    Ryan, Mary Anne
    Mathieson, Sean
    Boylan, Geraldine B.
    Mooney, Catherine
    42ND ANNUAL INTERNATIONAL CONFERENCES OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY: ENABLING INNOVATIVE TECHNOLOGIES FOR GLOBAL HEALTHCARE EMBC'20, 2020, : 58 - 61
  • [24] Data Mining Approach to Cataclysmic Variables Candidates Based on Random Forest Algorithm
    Jiang Bin
    Luo A-li
    Zhao Yong-heng
    SPECTROSCOPY AND SPECTRAL ANALYSIS, 2012, 32 (02) : 510 - 513
  • [25] A Study of Accounting Teaching Feature Selection and Importance Assessment Based on Random Forest Algorithm
    Hu, Jing
    Applied Mathematics and Nonlinear Sciences, 2024, 9 (01)
  • [26] Research on the application of data mining algorithm based on decision tree
    Song, Liangong
    Metallurgical and Mining Industry, 2015, 7 (09) : 843 - 848
  • [27] Random Forest-Based Coal Mine Roof Displacement Prediction and Application
    Li, Hongxia
    Wu, Rong
    Gao, Jianan
    ADVANCES IN CIVIL ENGINEERING, 2025, 2025 (01)
  • [28] Research on feature selection for AC contactor vibration signals based on regularized random forest with recursive selection
    Liu, Shuxin
    Qi, Xinzhi
    Xing, Chaojian
    Ming, Xin
    Lv, Xianfeng
    PLOS ONE, 2024, 19 (09)
  • [29] Random Forest-Based Manifold Learning for Classification of Imaging Data in Dementia
    Gray, Katherine R.
    Aljabar, Paul
    Heckemann, Rolf A.
    Hammers, Alexander
    Rueckert, Daniel
    MACHINE LEARNING IN MEDICAL IMAGING, 2011, 7009 : 159 - +
  • [30] Multi-Class Classification of Agricultural Data Based on Random Forest and Feature Selection
    Shi, Lei
    Qin, Yaqian
    Zhang, Juanjuan
    Wang, Yan
    Qiao, Hongbo
    Si, Haiping
    JOURNAL OF INFORMATION TECHNOLOGY RESEARCH, 2022, 15 (01)