Automated Assessment of Students' Code Comprehension using LLMs

Authors
Oli, Priti [1]
Banjade, Rabin [1]
Chapagain, Jeevan [1]
Rus, Vasile [1]
Affiliations
[1] Univ Memphis, Memphis, TN 38152 USA
Source
AI FOR EDUCATION WORKSHOP | 2024 / Vol. 257
Keywords
Automated Assessment; Large Language Model; Code Comprehension; Self-Explanation
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Assessing students' answers, particularly natural language answers, is a crucial challenge in the field of education. Advances in transformer-based models such as Large Language Models (LLMs) have led to significant progress in various natural language tasks. Nevertheless, amid the growing trend of evaluating LLMs across diverse tasks, their use in automated answer assessment has not received much attention. To address this gap, we explore the potential of using LLMs for automated assessment of students' short, open-ended answers in program comprehension tasks. In particular, we use LLMs to compare students' explanations with expert explanations in the context of line-by-line explanations of computer programs. For comparison purposes, we assess both decoder-only LLMs and encoder-based Semantic Textual Similarity (STS) models on the task of judging the correctness of students' explanations of computer code. Our findings indicate that decoder-only LLMs, when prompted in few-shot and chain-of-thought settings, perform comparably to fine-tuned encoder-based models in evaluating students' short answers in the programming domain.
Pages: 118 - 128
Page count: 11