Predictive Analysis of Students' Learning Performance Using Data Mining Techniques: A Comparative Study of Feature Selection Methods

Cited by: 7
Authors
Mustapha, S. M. F. D. Syed [1]
Affiliations
[1] Zayed Univ, Coll Technol Innovat, POB 144534, Dubai, U Arab Emirates
Keywords
data mining; feature selection methods; Boruta algorithm; lasso regression; recursive feature elimination (RFE); random forest importance (RFI)
DOI
10.3390/asi6050086
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The utilization of data mining techniques for the prompt prediction of academic success has gained significant importance in the current era. There is an increasing interest in utilizing these methodologies to forecast the academic performance of students, thereby facilitating educators to intervene and furnish suitable assistance when required. The purpose of this study was to determine the optimal methods for feature engineering and selection in the context of regression and classification tasks. This study compared the Boruta algorithm and Lasso regression for regression, and Recursive Feature Elimination (RFE) and Random Forest Importance (RFI) for classification. According to the findings, Gradient Boost for the regression part of this study had the least Mean Absolute Error (MAE) and Root-Mean-Square Error (RMSE) of 12.93 and 18.28, respectively, in the case of the Boruta selection method. In contrast, RFI was found to be the superior classification method, yielding an accuracy rate of 78% in the classification part. This research emphasized the significance of employing appropriate feature engineering and selection methodologies to enhance the efficacy of machine learning algorithms. Using a diverse set of machine learning techniques, this study analyzed the OULA dataset, focusing on both feature engineering and selection. Our approach was to systematically compare the performance of different models, leading to insights about the most effective strategies for predicting student success.
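The abstract reports regression quality as Mean Absolute Error (MAE) and Root-Mean-Square Error (RMSE). As a minimal sketch of how these two metrics are computed, the following uses made-up scores; the function names and the sample values are assumptions for illustration and are not taken from the paper or the OULA dataset.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_square_error(y_true, y_pred):
    """Square root of the average squared difference."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical student scores (illustrative only, not from the study).
actual = [70.0, 55.0, 90.0, 40.0]
predicted = [65.0, 60.0, 80.0, 60.0]

mae = mean_absolute_error(actual, predicted)
rmse = root_mean_square_error(actual, predicted)
print(f"MAE={mae:.2f} RMSE={rmse:.2f}")
```

RMSE is never smaller than MAE on the same predictions because squaring weights large errors more heavily, which is consistent with the reported pair of 12.93 (MAE) and 18.28 (RMSE).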
Pages: 24