Extracting topological features to identify at-risk students using machine learning and graph convolutional network models

Cited by: 6
Authors
Albreiki, Balqis [1, 2]
Habuza, Tetiana [1 ]
Zaki, Nazar [1 ]
Affiliations
[1] UAE Univ, Coll Informat Technol, Dept Comp Sci & Software Engn, Al Ain, U Arab Emirates
[2] UAE Univ, Off Inst Effectiveness, Al Ain, U Arab Emirates
Keywords
Student performance; Graph representation; Students at risk; Graph topological feature; Graph embedding; Graph convolutional network; Centrality
DOI
10.1186/s41239-023-00389-3
Chinese Library Classification
G40 [Education]
Discipline classification codes
040101; 120403
Abstract
Technological advances have significantly affected education, leading to the creation of online learning platforms such as virtual learning environments and massive open online courses. While these platforms offer a variety of features, none of them incorporates a module that accurately predicts students' academic performance and commitment. Consequently, it is crucial to design machine learning (ML) methods that predict student performance and identify at-risk students as early as possible. Graph representations of student data provide new insights into this area. This paper describes a simple but highly accurate technique for converting tabulated data into graphs. We employ distance measures (Euclidean and cosine) to calculate the similarities between students' data and construct a graph. We extract graph topological features (GF) to enhance our data. This allows us to capture structural correlations among the data and gain deeper insights than isolated data analysis. The initial dataset (DS) and GF can be used alone or jointly to improve the predictive power of the ML method. The proposed method is tested on an educational dataset and returns superior results. The use of DS alone is compared with the use of DS + GF in the classification of students into three classes: "failed", "at risk", and "good". The area under the receiver operating characteristic curve (AUC) reaches 0.948 using DS, compared with 0.964 for DS + GF. The accuracy in the case of DS + GF varies from 84.5% to 87.3%. Adding GF improves the performance by 2.019% in terms of AUC and 3.261% in terms of accuracy. Moreover, by incorporating graph topological features through a graph convolutional network (GCN), the prediction performance can be enhanced by 0.5% in terms of accuracy and 0.9% in terms of AUC under the cosine distance matrix. With the Euclidean distance matrix, adding the GCN improves the prediction accuracy by 3.7% and the AUC by 2.4%.
By adding graph embedding features to ML models, at-risk students can be identified with 87.4% accuracy and 0.97 AUC. The proposed solution provides a tool for the early detection of at-risk students. This will benefit universities and enhance their prediction performance, improving both effectiveness and reputation.
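The pipeline the abstract outlines (tabular rows, a similarity graph, topological features, then DS + GF) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how edges are selected or which topological features are used, so the similarity threshold, the mapping from Euclidean distance to a similarity score, the two features computed here (degree centrality and local clustering coefficient), and the sample student rows are all assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def euclidean_similarity(u, v):
    """Assumed mapping of Euclidean distance to a similarity in (0, 1]."""
    return 1.0 / (1.0 + math.dist(u, v))


def build_graph(rows, similarity, threshold):
    """Link two students when their similarity reaches the (assumed) threshold."""
    neighbors = [set() for _ in rows]
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            if similarity(rows[i], rows[j]) >= threshold:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def topological_features(neighbors):
    """Return [degree centrality, local clustering coefficient] per node."""
    n = len(neighbors)
    features = []
    for nbrs in neighbors:
        k = len(nbrs)
        degree_centrality = k / (n - 1) if n > 1 else 0.0
        # Count pairs of neighbors that are themselves connected.
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in neighbors[a])
        possible = k * (k - 1) / 2
        clustering = links / possible if possible else 0.0
        features.append([degree_centrality, clustering])
    return features


def augment(rows, graph_features):
    """Concatenate the original columns (DS) with the graph features (GF)."""
    return [list(row) + gf for row, gf in zip(rows, graph_features)]


# Hypothetical, pre-scaled student records: [quiz, homework, attendance].
students = [[0.90, 0.80, 1.00], [0.85, 0.75, 0.95], [0.20, 0.30, 0.40]]
graph = build_graph(students, euclidean_similarity, threshold=0.8)
ds_plus_gf = augment(students, topological_features(graph))
for row in ds_plus_gf:
    print(row)
```

The augmented rows can then be passed to any standard classifier in place of the original table. Thresholding is only one way to turn a similarity matrix into a graph; a k-nearest-neighbor rule would be an equally plausible choice.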
Pages: 22
Related papers
50 in total
  • [21] Testing Machine Learning Models to Identify Computer Science Students at High-risk of Probation
    Barkam, Hamza Errahmouni
    Wang, Max
    Neda, Barbara Martinez
    Gago-Masague, Sergio
    PROCEEDINGS OF THE 53RD ACM TECHNICAL SYMPOSIUM ON COMPUTER SCIENCE EDUCATION (SIGCSE 2022), VOL 2, 2022, : 1161 - 1161
  • [22] Using graph embedding and machine learning to identify rebels on twitter
    Masood, Muhammad Ali
    Abbasi, Rabeeh Ayaz
    JOURNAL OF INFORMETRICS, 2021, 15 (01)
  • [23] Fraud risk assessment in car insurance using claims graph features in machine learning
    Vorobyev, Ivan
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 251
  • [24] Detecting at-risk mental states for psychosis (ARMS) using machine learning ensembles and facial features
    Loch, Alexandre Andrade
    Gondim, Joao Medrado
    Argolo, Felipe Coelho
    Lopes-Rocha, Ana Caroline
    Andrade, Julio Cesar
    van de Bilt, Martinus Theodorus
    de Jesus, Leonardo Peroni
    Haddad, Natalia Mansur
    Cecchi, Guillermo A.
    Mota, Natalia Bezerra
    Gattaz, Wagner Farid
    Corcoran, Cheryl Mary
    Ara, Anderson
    SCHIZOPHRENIA RESEARCH, 2023, 258 : 45 - 52
  • [25] Using Predicted Academic Performance to Identify At-Risk Students in Public Schools
    Fazlul, Ishtiaque
    Koedel, Cory
    Parsons, Eric
    EDUCATIONAL EVALUATION AND POLICY ANALYSIS, 2024
  • [26] Apply Machine Learning Algorithms to Predict At-Risk Students to Admission Period
    Embarak, Ossama
    2020 SEVENTH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY TRENDS (ITT 2020), 2020, : 190 - 195
  • [27] A Conceptual Predictive Analytics Model for the Identification of at-risk students in VLE using Machine Learning Techniques
    Shafiq, Dalia Abdulkareem
    Marjani, Mohsen
    Habeeb, Riyaz Ahamed Ariyaluran
    Asirvatham, David
    2022 14TH INTERNATIONAL CONFERENCE ON MATHEMATICS, ACTUARIAL SCIENCE, COMPUTER SCIENCE AND STATISTICS (MACS), 2022
  • [28] Extracting Optimal Number of Features for Machine Learning Models in Multilayer IoT Attacks
    Al Sukhni, Badeea
    Manna, Soumya K.
    Dave, Jugal M.
    Zhang, Leishi
    SENSORS, 2024, 24 (24)
  • [29] Using dynamic graph convolutional network to identify individuals with major depression disorder
    Zhou, Ni
    Yuan, Ze
    Zhou, Hongying
    Lyu, Dongbin
    Wang, Fan
    Wang, Meiti
    Lu, Zhongjiao
    Huang, Qinte
    Chen, Yiming
    Huang, Haijing
    Cao, Tongdan
    Wu, Chenglin
    Yang, Weichieh
    Hong, Wu
    JOURNAL OF AFFECTIVE DISORDERS, 2025, 371 : 188 - 195
  • [30] Machine Learning: Research on Detection of Network Security Vulnerabilities by Extracting and Matching Features
    Xue Y.
    Journal of Cyber Security and Mobility, 2023, 12 (05) : 697 - 710