Extracting topological features to identify at-risk students using machine learning and graph convolutional network models

Cited by: 6
Authors
Albreiki, Balqis [1, 2]
Habuza, Tetiana [1 ]
Zaki, Nazar [1 ]
Affiliations
[1] UAE Univ, Coll Informat Technol, Dept Comp Sci & Software Engn, Al Ain, U Arab Emirates
[2] UAE Univ, Off Inst Effectiveness, Al Ain, U Arab Emirates
Keywords
Student performance; Graph representation; Students at risk; Graph topological feature; Graph embedding; Graph convolutional network; Centrality
DOI
10.1186/s41239-023-00389-3
Chinese Library Classification number
G40 [Education]
Discipline classification codes
040101; 120403
Abstract
Technological advances have significantly affected education, leading to the creation of online learning platforms such as virtual learning environments and massive open online courses. While these platforms offer a variety of features, none of them incorporates a module that accurately predicts students' academic performance and commitment. Consequently, it is crucial to design machine learning (ML) methods that predict student performance and identify at-risk students as early as possible. Graph representations of student data provide new insights into this area. This paper describes a simple but highly accurate technique for converting tabulated data into graphs. We employ distance measures (Euclidean and cosine) to calculate the similarities between students' data and construct a graph. We extract graph topological features (GF) to enhance our data. This allows us to capture structural correlations among the data and gain deeper insights than isolated data analysis. The initial dataset (DS) and GF can be used alone or jointly to improve the predictive power of the ML method. The proposed method is tested on an educational dataset and returns superior results. The use of DS alone is compared with the use of DS + GF in the classification of students into three classes: "failed", "at risk", and "good". The area under the receiver operating characteristic curve (AUC) reaches 0.948 using DS, compared with 0.964 for DS + GF. The accuracy in the case of DS + GF varies from 84.5% to 87.3%. Adding GF improves the performance by 2.019% in terms of AUC and 3.261% in terms of accuracy. Moreover, by incorporating graph topological features through a graph convolutional network (GCN), the prediction performance can be enhanced by 0.5% in terms of accuracy and 0.9% in terms of AUC under the cosine distance matrix. With the Euclidean distance matrix, adding the GCN improves the prediction accuracy by 3.7% and the AUC by 2.4%. By adding graph embedding features to ML models, at-risk students can be identified with 87.4% accuracy and 0.97 AUC. The proposed solution provides a tool for the early detection of at-risk students. This will benefit universities and enhance their prediction performance, improving both effectiveness and reputation.
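The table-to-graph step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the distance threshold used to create edges, the two topological features chosen (degree centrality and local clustering coefficient), and all function names (`build_graph`, `topological_features`, `augment`) are assumptions made for illustration, since the abstract does not specify them.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity); 1.0 if either vector is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 1.0
    return 1.0 - dot / (na * nb)


def build_graph(rows, dist, threshold):
    """Connect two students when their distance is at most `threshold`.

    The thresholding rule is an assumption; the abstract only says that
    distance measures are used to construct the graph.
    """
    adj = {i: set() for i in range(len(rows))}
    for i, j in combinations(range(len(rows)), 2):
        if dist(rows[i], rows[j]) <= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def topological_features(adj):
    """Per-node [degree centrality, local clustering coefficient]."""
    n = len(adj)
    feats = []
    for v in range(n):
        nbrs = adj[v]
        k = len(nbrs)
        degree_centrality = k / (n - 1) if n > 1 else 0.0
        # Count edges that exist among the neighbours of v.
        links = sum(1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a])
        clustering = 2 * links / (k * (k - 1)) if k > 1 else 0.0
        feats.append([degree_centrality, clustering])
    return feats


def augment(rows, dist=euclidean, threshold=1.0):
    """Return DS + GF: each original row with its graph features appended."""
    adj = build_graph(rows, dist, threshold)
    gf = topological_features(adj)
    return [list(r) + f for r, f in zip(rows, gf)]


if __name__ == "__main__":
    # Hypothetical toy data: four students, two numeric attributes each.
    students = [[0, 0], [0, 1], [1, 0], [10, 10]]
    for row in augment(students, euclidean, threshold=1.5):
        print(row)
```

The augmented rows could then be passed to any standard classifier in place of the original table, which mirrors the DS versus DS + GF comparison reported in the abstract.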
Pages: 22