Evaluating the Application of Large Language Models to Generate Feedback in Programming Education

Cited by: 2
Authors
Jacobs, Sven [1]
Jaschke, Steffen [1]
Affiliations
[1] Univ Siegen, Comp Sci Educ, Siegen, Germany
DOI
10.1109/EDUCON60312.2024.10578838
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
This study investigates the application of large language models, specifically GPT-4, to enhance programming education. The research outlines the design of a web application that uses GPT-4 to provide feedback on programming tasks, without giving away the solution. A web application for working on programming tasks was developed for the study and evaluated with 51 students over the course of one semester. The results show that most of the feedback generated by GPT-4 effectively addressed code errors. However, challenges with incorrect suggestions and hallucinated issues indicate the need for further improvements.
Pages: 5