Research on the Application of Random Forest-based Feature Selection Algorithm in Data Mining Experiments

Authors
Wang, Huan [1]
Affiliation
[1] Southwest Forestry Univ, Coll Big Data & Intelligence Engn, Kunming 650224, Yunnan, Peoples R China
Keywords
Random forest; SVM; machine learning; big data; feature selection; best-first search; rough set theory
DOI
10.14569/IJACSA.2023.0141054
Chinese Library Classification
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
High-dimensional big data presents substantial challenges for Machine Learning (ML) algorithms, mainly because the curse of dimensionality leads to computational inefficiency and an increased risk of overfitting. Various dimensionality reduction and Feature Selection (FS) techniques have been developed to alleviate these challenges. Random Forest (RF), a widely used Ensemble Learning Method (ELM), is recognized for its high accuracy and robustness, and it has a lesser-known capability for effective FS. Although specialized RF models have been designed for FS, they often struggle with computational efficiency on large datasets. To address these limitations, this study proposes a novel Feature Selection Model (FSM) integrated with data reduction techniques, termed Dynamic Correlated Regularized Random Forest (DCRRF). The architecture operates in four phases: preprocessing, Feature Reduction (FR) using Best-First Search with Rough Set Theory (BFS-RST), FS through DCRRF, and assessment of the selected features using a Support Vector Machine (SVM) classifier. Evaluated on four gene expression datasets, the proposed model outperforms existing RF-based methods in both computational efficiency and classification accuracy. The study thus introduces a robust and efficient approach to feature selection in high-dimensional big-data scenarios.
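The four-phase structure described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's method: the record gives no implementation details for BFS-RST or DCRRF, so the sketch assumes scikit-learn and swaps in a univariate filter (`SelectKBest`) for the feature reduction phase and plain Random Forest importance ranking for the feature selection phase. The function name `run_pipeline`, the synthetic data, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def run_pipeline(n_reduce=100, n_keep=20, seed=0):
    """Hypothetical four-phase pipeline; stand-ins are noted in each phase."""
    # Synthetic stand-in for a gene expression matrix (few samples, many features).
    X, y = make_classification(
        n_samples=200, n_features=500, n_informative=15,
        n_redundant=5, random_state=seed,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )

    # Phase 1: preprocessing. Standardize with training-set statistics only.
    scaler = StandardScaler().fit(X_train)
    X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

    # Phase 2: feature reduction. ASSUMPTION: an ANOVA F-test filter is used
    # here as a simple placeholder for BFS-RST, which is not reproduced.
    reducer = SelectKBest(score_func=f_classif, k=n_reduce).fit(X_train, y_train)
    X_train_red, X_test_red = reducer.transform(X_train), reducer.transform(X_test)

    # Phase 3: feature selection. ASSUMPTION: plain Random Forest impurity
    # importances are ranked here in place of DCRRF.
    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(X_train_red, y_train)
    top = np.argsort(forest.feature_importances_)[::-1][:n_keep]

    # Phase 4: assess the selected features with an SVM classifier.
    svm = SVC(kernel="linear").fit(X_train_red[:, top], y_train)
    accuracy = svm.score(X_test_red[:, top], y_test)

    # Map the selected columns back to indices in the original feature matrix.
    selected = reducer.get_support(indices=True)[top]
    return {"selected": selected, "accuracy": accuracy}


if __name__ == "__main__":
    result = run_pipeline()
    print("selected feature indices:", sorted(result["selected"].tolist()))
    print("SVM test accuracy:", round(result["accuracy"], 3))
```

Fitting the scaler, the filter, and the forest on the training split only keeps the held-out SVM accuracy from being inflated by information leaking from the test samples, which matters when features far outnumber samples.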
Pages: 505-518
Page count: 14