Feedback-Generation for Programming Exercises With GPT-4

Cited by: 6
Authors
Azaiz, Imen [1]
Kiesler, Natalie [2]
Strickroth, Sven [1]
Affiliations
[1] Ludwig Maximilians Univ Munchen, Munich, Germany
[2] Nuremberg Tech, Nurnberg, Germany
Keywords
formative feedback; personalized feedback; assessment; introductory programming; Large Language Models; LLMs; GPT-4; Turbo; benchmarking;
DOI
10.1145/3649217.3653594
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Ever since Large Language Models (LLMs) and related applications have become broadly available, several studies have investigated their potential for assisting educators and supporting students in higher education. LLMs such as Codex, GPT-3.5, and GPT-4 have shown promising results in the context of large programming courses, where students can benefit from feedback and hints if these are provided in a timely manner and at scale. This paper explores the quality of GPT-4 Turbo's generated output for prompts containing both the programming task specification and a student's submission as input. Two assignments from an introductory programming course were selected, and GPT-4 was asked to generate feedback for 55 randomly chosen, authentic student programming submissions. The output was qualitatively analyzed regarding correctness, personalization, fault localization, and other features identified in the material. Compared to prior work and analyses of GPT-3.5, GPT-4 Turbo shows notable improvements. For example, the output is more structured and consistent. GPT-4 Turbo can also accurately identify invalid casing in student programs' output. In some cases, the feedback also includes the output of the student program. At the same time, inconsistent feedback was noted, such as stating that the submission is correct but that an error needs to be fixed. The present work increases our understanding of LLMs' potential, limitations, and how to integrate them into e-assessment systems, pedagogical scenarios, and instructing students who are using applications based on GPT-4.
Pages: 31-37
Page count: 7