Automated Assessment of Students' Code Comprehension using LLMs

Cited: 0
Authors
Oli, Priti [1 ]
Banjade, Rabin [1 ]
Chapagain, Jeevan [1 ]
Rus, Vasile [1 ]
Affiliations
[1] Univ Memphis, Memphis, TN 38152 USA
Source
AI FOR EDUCATION WORKSHOP, 2024, Vol. 257
Keywords
Automated Assessment; Large Language Model; Code Comprehension; Self-Explanation;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Assessing students' answers, particularly natural language answers, is a crucial challenge in education. Advances in transformer-based models, such as Large Language Models (LLMs), have led to significant progress on many natural language tasks. Nevertheless, amidst the growing trend of evaluating LLMs across diverse tasks, their use for automated answer assessment has received little attention. To address this gap, we explore the potential of LLMs for automated assessment of students' short, open-ended answers in program comprehension tasks. In particular, we use LLMs to compare students' explanations with expert explanations in the context of line-by-line explanations of computer programs. For comparison, we assess both decoder-only LLMs and encoder-based Semantic Textual Similarity (STS) models on judging the correctness of students' explanations of computer code. Our findings indicate that decoder-only LLMs, when prompted in few-shot and chain-of-thought settings, perform comparably to fine-tuned encoder-based models in evaluating students' short answers in the programming domain.
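The record itself contains no code, but the two assessment strategies named in the abstract can be sketched briefly. The following is a minimal Python sketch, assuming an OpenAI-style chat API for the decoder-only LLM and the sentence-transformers library for the encoder-based STS baseline; the model names, prompt wording, few-shot example, and similarity threshold are all illustrative assumptions, not details taken from the paper.

```python
# Sketch of the two assessment strategies compared in the paper:
# (1) a decoder-only LLM prompted few-shot with chain-of-thought, and
# (2) an encoder-based Semantic Textual Similarity (STS) model.
# Model choices, prompt text, and the 0.7 threshold are illustrative
# assumptions, not details from the paper.

from openai import OpenAI
from sentence_transformers import SentenceTransformer, util

client = OpenAI()  # expects OPENAI_API_KEY in the environment
sts_model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder

FEW_SHOT_PROMPT = """You grade students' line-by-line explanations of code.
Compare the student's explanation with the expert explanation and decide
whether the student's explanation is Correct or Incorrect. Think step by
step before answering.

Code line: total += price
Expert: Adds the current price to the running total.
Student: It increases total by the value of price.
Reasoning: The student captures the same accumulation behavior.
Verdict: Correct

Code line: {code_line}
Expert: {expert}
Student: {student}
Reasoning:"""

def assess_with_llm(code_line: str, expert: str, student: str) -> str:
    """Few-shot, chain-of-thought prompting of a decoder-only LLM."""
    response = client.chat.completions.create(
        model="gpt-4",  # illustrative; the paper's exact models may differ
        messages=[{"role": "user", "content": FEW_SHOT_PROMPT.format(
            code_line=code_line, expert=expert, student=student)}],
        temperature=0,
    )
    return response.choices[0].message.content

def assess_with_sts(expert: str, student: str, threshold: float = 0.7) -> str:
    """Encoder-based baseline: cosine similarity of sentence embeddings."""
    emb = sts_model.encode([expert, student], convert_to_tensor=True)
    score = util.cos_sim(emb[0], emb[1]).item()
    return "Correct" if score >= threshold else "Incorrect"

if __name__ == "__main__":
    line = "for i in range(len(nums)):"
    expert = "Iterates over the indices of the nums list."
    student = "Loops through every position in the list."
    print(assess_with_llm(line, expert, student))
    print(assess_with_sts(expert, student))
```

The STS baseline reduces assessment to a single similarity score against a fixed cutoff, whereas the prompted LLM produces a verdict with an intermediate reasoning step; the paper's finding is that the latter, in few-shot chain-of-thought settings, is competitive with fine-tuned encoder-based models.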
Pages: 118-128
Page count: 11