Extracting topological features to identify at-risk students using machine learning and graph convolutional network models

Cited by: 6
Authors
Albreiki, Balqis [1, 2]
Habuza, Tetiana [1 ]
Zaki, Nazar [1 ]
Affiliations
[1] UAE Univ, Coll Informat Technol, Dept Comp Sci & Software Engn, Al Ain, U Arab Emirates
[2] UAE Univ, Off Inst Effectiveness, Al Ain, U Arab Emirates
Keywords
Student performance; Graph representation; Students at risk; Graph topological feature; Graph embedding; Graph convolutional network; CENTRALITY;
DOI
10.1186/s41239-023-00389-3
Chinese Library Classification
G40 [Education];
Discipline classification code
040101 ; 120403 ;
Abstract
Technological advances have significantly affected education, leading to the creation of online learning platforms such as virtual learning environments and massive open online courses. While these platforms offer a variety of features, none of them incorporates a module that accurately predicts students' academic performance and commitment. Consequently, it is crucial to design machine learning (ML) methods that predict student performance and identify at-risk students as early as possible. Graph representations of student data provide new insights into this area. This paper describes a simple but highly accurate technique for converting tabular data into graphs. We employ distance measures (Euclidean and cosine) to calculate the similarities between students' data and construct a graph. We extract graph topological features (GF) to enhance our data. This allows us to capture structural correlations among the data and gain deeper insights than isolated data analysis. The initial dataset (DS) and GF can be used alone or jointly to improve the predictive power of the ML method. The proposed method is tested on an educational dataset and returns superior results. The use of DS alone is compared with the use of DS + GF in the classification of students into three classes: "failed", "at risk", and "good". The area under the receiver operating characteristic curve (AUC) reaches 0.948 using DS, compared with 0.964 for DS + GF. The accuracy in the case of DS + GF varies from 84.5% to 87.3%. Adding GF improves the performance by 2.019% in terms of AUC and 3.261% in terms of accuracy. Moreover, by incorporating graph topological features through a graph convolutional network (GCN), the prediction performance can be enhanced by 0.5% in terms of accuracy and 0.9% in terms of AUC under the cosine distance matrix. With the Euclidean distance matrix, adding the GCN improves the prediction accuracy by 3.7% and the AUC by 2.4%.
By adding graph embedding features to ML models, at-risk students can be identified with 87.4% accuracy and 0.97 AUC. The proposed solution provides a tool for the early detection of at-risk students. This will benefit universities and enhance their prediction performance, improving both effectiveness and reputation.
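The pipeline outlined in the abstract (tabular data, then a distance matrix, then a graph, then topological features appended to the original data) can be illustrated with a minimal sketch. This is not the authors' implementation: the distance threshold used to create edges, the two features shown (degree centrality and local clustering coefficient), the function names, and the toy score matrix are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def cosine_distance_matrix(X):
    """Pairwise cosine distance (1 - cosine similarity) between rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.clip(norms, 1e-12, None)  # guard against zero rows
    return 1.0 - unit @ unit.T


def euclidean_distance_matrix(X):
    """Pairwise Euclidean distance between rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def build_adjacency(dist, threshold):
    """Link two students when their distance is below a threshold.

    The thresholding rule is an assumption of this sketch.
    """
    adj = (dist < threshold).astype(int)
    np.fill_diagonal(adj, 0)  # no self-loops
    return adj


def topological_features(adj):
    """Per-node degree centrality and local clustering coefficient."""
    n = adj.shape[0]
    degree = adj.sum(axis=1)
    degree_centrality = degree / (n - 1)
    # diag(A^3) counts closed walks of length 3; each triangle is counted twice.
    triangles = np.diag(adj @ adj @ adj) / 2
    possible = degree * (degree - 1) / 2
    clustering = np.divide(
        triangles, possible, out=np.zeros(n), where=possible > 0
    )
    return np.column_stack([degree_centrality, clustering])


# Toy data (hypothetical): five students, three assessment scores each.
X = np.array(
    [
        [90.0, 85.0, 88.0],
        [88.0, 80.0, 90.0],
        [40.0, 35.0, 50.0],
        [45.0, 30.0, 42.0],
        [65.0, 60.0, 70.0],
    ]
)

adj = build_adjacency(euclidean_distance_matrix(X), threshold=15.0)
gf = topological_features(adj)

# DS + GF: original columns followed by the graph-derived columns.
ds_plus_gf = np.hstack([X, gf])
```

The augmented matrix `ds_plus_gf` could then be passed to any standard classifier in place of `X`. Swapping `euclidean_distance_matrix` for `cosine_distance_matrix` (with a correspondingly smaller threshold) gives the cosine variant mentioned in the abstract.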
Pages: 22
Related papers
50 in total
  • [1] Extracting topological features to identify at-risk students using machine learning and graph convolutional network models
    Balqis Albreiki
    Tetiana Habuza
    Nazar Zaki
    International Journal of Educational Technology in Higher Education, 20
  • [2] Using machine learning to identify the most at-risk students in physics classes
    Yang, Jie
    DeVore, Seth
    Hewagallage, Dona
    Miller, Paul
    Ryan, Qing X.
    Stewart, John
    PHYSICAL REVIEW PHYSICS EDUCATION RESEARCH, 2020, 16 (02):
  • [3] Using Convolutional Neural Network to Recognize Learning Images for Early Warning of At-Risk Students
    Yang, Zongkai
    Yang, Juan
    Rice, Kerry
    Hung, Jui-Long
    Du, Xu
    IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES, 2020, 13 (03): 617 - 630
  • [4] Early Detection At-Risk Students using Machine Learning
    Pongpaichet, Siripen
    Jankapor, Sawarin
    Janchai, Sarun
    Tongsanit, Todsaporn
    11TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE: DATA, NETWORK, AND AI IN THE AGE OF UNTACT (ICTC 2020), 2020: 283 - 287
  • [5] Learning Analytics to Identify Students at-risk in MOOCs
    Srilekshmi, M.
    Sindhumol, S.
    Chatterjee, Shiffon
    Bijlani, Kamal
    2016 IEEE 8TH INTERNATIONAL CONFERENCE ON TECHNOLOGY FOR EDUCATION (T4E 2016), 2016: 194 - 199
  • [6] Using Machine Learning to Identify At-risk Students in an Introductory Programming Course at a Two-year Public College
    Cooper, Cameron
    ADVANCES IN ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING, 2022, 2 (03): 407 - 421
  • [7] Detecting At-Risk Students With Early Interventions Using Machine Learning Techniques
    Al-Shabandar, Raghad
    Hussain, Abir Jaafar
    Liatsis, Panos
    Keight, Robert
    IEEE ACCESS, 2019, 7 : 149464 - 149478
  • [8] Predicting at-Risk Students at Different Percentages of Course Length for Early Intervention Using Machine Learning Models
    Adnan, Muhammad
    Habib, Asad
    Ashraf, Jawad
    Mussadiq, Shafaq
    Raza, Arsalan Ali
    Abid, Muhammad
    Bashir, Maryam
    Khan, Sana Ullah
    IEEE ACCESS, 2021, 9 : 7519 - 7539
  • [9] Predicting at-Risk Students at Different Percentages of Course Length for Early Intervention Using Machine Learning Models
    Adnan, Muhammad
    Habib, Asad
    Ashraf, Jawad
    Mussadiq, Shafaq
    Raza, Arsalan Ali
    Abid, Muhammad
    Bashir, Maryam
    Khan, Sana Ullah
    IEEE Access, 2021, 9 : 7519 - 7539
  • [10] Analysing University at-Risk Students in a Virtual Learning Environment using Machine Learning Algorithms
    Naidoo, Deshalin
    Adeliyi, Timothy T.
    2023 CONFERENCE ON INFORMATION COMMUNICATIONS TECHNOLOGY AND SOCIETY, ICTAS, 2023: 113 - 119