Automated Assessment of Students' Code Comprehension using LLMs

Cited by: 0

Authors:

Oli, Priti [1]

Banjade, Rabin [1]

Chapagain, Jeevan [1]

Rus, Vasile [1]

Affiliations:

[1] Univ Memphis, Memphis, TN 38152 USA

Source:

AI FOR EDUCATION WORKSHOP | 2024 / Vol. 257

Keywords:

Automated Assessment; Large Language Model; Code Comprehension; Self-Explanation

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Assessing students' answers, particularly natural language answers, is a crucial challenge in the field of education. Advances in transformer-based models, such as Large Language Models (LLMs), have led to significant progress in various natural language tasks. Nevertheless, amid the growing trend of evaluating LLMs across diverse tasks, their use in automated answer assessment has not received much attention. To address this gap, we explore the potential of using LLMs for automated assessment of students' short and open-ended answers in program comprehension tasks. In particular, we use LLMs to compare students' explanations with expert explanations in the context of line-by-line explanations of computer programs. For comparison purposes, we assess both decoder-only LLMs and encoder-based Semantic Textual Similarity (STS) models on the task of judging the correctness of students' explanations of computer code. Our findings indicate that decoder-only LLMs, when prompted in few-shot and chain-of-thought settings, perform comparably to fine-tuned encoder-based models in evaluating students' short answers in the programming domain.

Pages: 118-128

Page count: 11
