Feedback-Generation for Programming Exercises With GPT-4

Cited by: 6

Authors:

Azaiz, Imen [1]

Kiesler, Natalie [2]

Strickroth, Sven [1]

Affiliations:

[1] Ludwig Maximilians Univ Munchen, Munich, Germany

[2] Nuremberg Tech, Nurnberg, Germany

Source:

PROCEEDINGS OF THE 2024 CONFERENCE INNOVATION AND TECHNOLOGY IN COMPUTER SCIENCE EDUCATION, VOL 1, ITICSE 2024 | 2024

Keywords:

formative feedback; personalized feedback; assessment; introductory programming; Large Language Models; LLMs; GPT-4; Turbo; benchmarking;

DOI:

10.1145/3649217.3653594

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Ever since Large Language Models (LLMs) and related applications have become broadly available, several studies investigated their potential for assisting educators and supporting students in higher education. LLMs such as Codex, GPT-3.5, and GPT-4 have shown promising results in the context of large programming courses, where students can benefit from feedback and hints if provided timely and at scale. This paper explores the quality of GPT-4 Turbo's generated output for prompts containing both the programming task specification and a student's submission as input. Two assignments from an introductory programming course were selected, and GPT-4 was asked to generate feedback for 55 randomly chosen, authentic student programming submissions. The output was qualitatively analyzed regarding correctness, personalization, fault localization, and other features identified in the material. Compared to prior work and analyses of GPT-3.5, GPT-4 Turbo shows notable improvements. For example, the output is more structured and consistent. GPT-4 Turbo can also accurately identify invalid casing in student programs' output. In some cases, the feedback also includes the output of the student program. At the same time, inconsistent feedback was noted such as stating that the submission is correct but an error needs to be fixed. The present work increases our understanding of LLMs' potential, limitations, and how to integrate them into e-assessment systems, pedagogical scenarios, and instructing students who are using applications based on GPT-4.


Pages: 31-37

Page count: 7
