Feedback-Generation for Programming Exercises With GPT-4

Cited by: 6

Authors:

Azaiz, Imen [1]

Kiesler, Natalie [2]

Strickroth, Sven [1]

Affiliations:

[1] Ludwig Maximilians Univ Munchen, Munich, Germany

[2] Nuremberg Tech, Nurnberg, Germany

Source:

PROCEEDINGS OF THE 2024 CONFERENCE INNOVATION AND TECHNOLOGY IN COMPUTER SCIENCE EDUCATION, VOL 1, ITICSE 2024 | 2024

Keywords:

formative feedback; personalized feedback; assessment; introductory programming; Large Language Models; LLMs; GPT-4; Turbo; benchmarking;

DOI:

10.1145/3649217.3653594

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Ever since Large Language Models (LLMs) and related applications have become broadly available, several studies have investigated their potential for assisting educators and supporting students in higher education. LLMs such as Codex, GPT-3.5, and GPT-4 have shown promising results in the context of large programming courses, where students can benefit from feedback and hints if provided in a timely manner and at scale. This paper explores the quality of GPT-4 Turbo's generated output for prompts containing both the programming task specification and a student's submission as input. Two assignments from an introductory programming course were selected, and GPT-4 was asked to generate feedback for 55 randomly chosen, authentic student programming submissions. The output was qualitatively analyzed regarding correctness, personalization, fault localization, and other features identified in the material. Compared to prior work and analyses of GPT-3.5, GPT-4 Turbo shows notable improvements. For example, the output is more structured and consistent. GPT-4 Turbo can also accurately identify invalid casing in student programs' output. In some cases, the feedback also includes the output of the student program. At the same time, inconsistent feedback was noted, such as stating that the submission is correct but that an error needs to be fixed. The present work increases our understanding of LLMs' potential, limitations, and how to integrate them into e-assessment systems, pedagogical scenarios, and instructing students who are using applications based on GPT-4.

Citation

Pages: 31-37

Number of pages: 7

50 records in total

[41] If in a Crowdsourced Data Annotation Pipeline, a GPT-4
He, Zeyu
Huang, Chieh-Yang
Ding, Chien-Kuang Cornelia
Rohatgi, Shaurya
Huang, Ting-Hao 'Kenneth'
PROCEEDINGS OF THE 2024 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI 2024), 2024
[42] Exploring the capabilities of large language models for the generation of safety cases: the case of GPT-4
Sivakumar, Mithila
Belle, Alvine Boaye
Shan, Jinjun
Shahandashti, Kimya Khakzad
32ND INTERNATIONAL REQUIREMENTS ENGINEERING CONFERENCE WORKSHOPS, REW 2024, 2024, : 35 - 45
[43] Harnessing GPT-4 for generation of cybersecurity GRC policies: A focus on ransomware attack mitigation
McIntosh, Timothy
Liu, Tong
Susnjak, Teo
Alavizadeh, Hooman
Ng, Alex
Nowrozy, Raza
Watters, Paul
COMPUTERS & SECURITY, 2023, 134
[44] Teaching Plan Generation and Evaluation With GPT-4: Unleashing the Potential of LLM in Instructional Design
Hu, Bihao
Zheng, Longwei
Zhu, Jiayi
Ding, Lishan
Wang, Yilei
Gu, Xiaoqing
IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES, 2024, 17 : 1471 - 1485
[45] Utilizing GPT-4 to interpret oral mucosal disease photographs for structured report generation
Zhan, Zheng-Zhe
Xiong, Yu-Tao
Wang, Chen-Yuan
Zhang, Bao-Tian
Lian, Wen-Jun
Zeng, Yu-Min
Liu, Wei
Tang, Wei
Liu, Chang
SCIENTIFIC REPORTS, 2025, 15 (01)
[46] Enhancing radiology training with GPT-4: Pilot analysis of automated feedback in trainee preliminary reports
Bala, Wasif
Li, Hanzhou
Moon, John
Trivedi, Hari
Gichoya, Judy
Balthazar, Patricia
CURRENT PROBLEMS IN DIAGNOSTIC RADIOLOGY, 2025, 54 (02) : 151 - 158
[47] GPT-4 in Nuclear Medicine Education: Does It Outperform GPT-3.5?
Currie, Geoffrey M.
JOURNAL OF NUCLEAR MEDICINE TECHNOLOGY, 2023, 51 (04) : 314 - 317
[48] Correspondence on Chat GPT-4, GPT-3.5 and drug information queries
Kleebayoon, Amnuay
Wiwanitkit, Viroj
JOURNAL OF TELEMEDICINE AND TELECARE, 2023
[49] Is GPT-4 capable of passing MIR 2023? Comparison between GPT-4 and ChatGPT-3 in the MIR 2022 and 2023 exams
Cerame, Alvaro
Juaneda, Juan
Estrella-Porter, Pablo
de la Puente, Lucia
Navarro, Joaquin
Garcia, Eva
Sanchez, Domingo A.
Carrasco, Juan Pablo
SPANISH JOURNAL OF MEDICAL EDUCATION, 2024, 5 (02)
[50] A GPT-4 Reticular Chemist for Guiding MOF Discovery
Zheng, Zhiling
Rong, Zichao
Rampal, Nakul
Borgs, Christian
Chayes, Jennifer T.
Yaghi, Omar M.
ANGEWANDTE CHEMIE-INTERNATIONAL EDITION, 2023, 62 (46)
