Extracting topological features to identify at-risk students using machine learning and graph convolutional network models

Cited by: 6
Authors
Albreiki, Balqis [1, 2]
Habuza, Tetiana [1]
Zaki, Nazar [1]
Affiliations
[1] UAE Univ, Coll Informat Technol, Dept Comp Sci & Software Engn, Al Ain, U Arab Emirates
[2] UAE Univ, Off Inst Effectiveness, Al Ain, U Arab Emirates
Keywords
Student performance; Graph representation; Students at risk; Graph topological feature; Graph embedding; Graph convolutional network; Centrality
DOI
10.1186/s41239-023-00389-3
Chinese Library Classification
G40 [Education]
Discipline classification codes
040101; 120403
Abstract
Technological advances have significantly affected education, leading to the creation of online learning platforms such as virtual learning environments and massive open online courses. While these platforms offer a variety of features, none of them incorporates a module that accurately predicts students' academic performance and commitment. Consequently, it is crucial to design machine learning (ML) methods that predict student performance and identify at-risk students as early as possible. Graph representations of student data provide new insights into this area. This paper describes a simple but highly accurate technique for converting tabulated data into graphs. We employ distance measures (Euclidean and cosine) to calculate the similarities between students' data and construct a graph. We extract graph topological features (GF) to enhance our data. This allows us to capture structural correlations among the data and gain deeper insights than isolated data analysis. The initial dataset (DS) and GF can be used alone or jointly to improve the predictive power of the ML method. The proposed method is tested on an educational dataset and returns superior results. The use of DS alone is compared with the use of DS + GF in the classification of students into three classes: "failed", "at risk", and "good". The area under the receiver operating characteristic curve (AUC) reaches 0.948 using DS, compared with 0.964 for DS + GF. The accuracy in the case of DS + GF varies from 84.5% to 87.3%. Adding GF improves the performance by 2.019% in terms of AUC and 3.261% in terms of accuracy. Moreover, by incorporating graph topological features through a graph convolutional network (GCN), the prediction performance can be enhanced by 0.5% in terms of accuracy and 0.9% in terms of AUC under the cosine distance matrix. With the Euclidean distance matrix, adding the GCN improves the prediction accuracy by 3.7% and the AUC by 2.4%.
By adding graph embedding features to ML models, at-risk students can be identified with 87.4% accuracy and 0.97 AUC. The proposed solution provides a tool for the early detection of at-risk students. This will benefit universities and enhance their prediction performance, improving both effectiveness and reputation.
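The pipeline outlined in the abstract (tabular rows, then a similarity graph, then topological features appended to the original data) can be illustrated with a minimal sketch. This is not the authors' implementation: the similarity threshold, the choice of degree and local clustering coefficient as the graph features, and all function names below are assumptions made for illustration only. The abstract does not state how edges are selected or which topological features are extracted.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def build_graph(rows, threshold):
    """Connect two students when their similarity meets the threshold.

    The thresholding rule is an assumption; the abstract only says that
    distance measures are used to construct the graph.
    """
    n = len(rows)
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if cosine_similarity(rows[i], rows[j]) >= threshold:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors

def topological_features(neighbors):
    """Return (degree, local clustering coefficient) for every node."""
    features = {}
    for node, nbrs in neighbors.items():
        k = len(nbrs)
        # Count edges that exist between pairs of this node's neighbors.
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in neighbors[u])
        clustering = 2 * links / (k * (k - 1)) if k > 1 else 0.0
        features[node] = (k, clustering)
    return features

def augment(rows, threshold=0.95):
    """Append graph features (GF) to each original row (DS), giving DS + GF."""
    neighbors = build_graph(rows, threshold)
    gf = topological_features(neighbors)
    return [list(row) + list(gf[i]) for i, row in enumerate(rows)]

# Toy data: three similar students and one dissimilar student (hypothetical values).
rows = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.8, 0.2],
    [0.0, 1.0],
]
augmented = augment(rows, threshold=0.95)
for row in augmented:
    print(row)
```

In this toy example the first three rows form a triangle in the graph, so each receives degree 2 and clustering coefficient 1.0, while the fourth row is isolated. The augmented rows could then be passed to any tabular classifier in place of the original rows.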
Pages: 22