Towards Automated Multiple Choice Question Generation and Evaluation: Aligning with Bloom's Taxonomy

Cited by: 1
Authors
Hwang, Kevin [1]
Wang, Kenneth [1]
Alomair, Maryam [2]
Choa, Fow-Sen [2]
Chen, Lujie Karen [2]
Affiliations
[1] Glenelg High Sch, Glenelg, MD 21737 USA
[2] Univ Maryland Baltimore Cty, Baltimore, MD 21250 USA
Keywords
automated question generation; GPT-4; Bloom's taxonomy; large language models; multiple choice question generation; item writing flaws
DOI
10.1007/978-3-031-64299-9_35
CLC number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Multiple Choice Questions (MCQs) are frequently used in educational assessments for their efficiency in grading and providing feedback. However, manually generating MCQs has some limitations and challenges. This study explores an AI-driven approach to creating and evaluating Bloom's Taxonomy-aligned college-level biology MCQs using a varied number of shots in few-shot prompting with GPT-4. Shots, or examples of correct prompt-response pairs, were sourced from previously published datasets of educator-approved MCQs labeled with their Bloom's Taxonomy levels and were matched to prompts via a maximal marginal relevance search. To obtain ground truths against which to compare GPT-4, three expert human evaluators, each with at least 4 years of educational experience, annotated a random sample of the generated questions with regard to relevance to the input prompt, classroom usability, and perceived Bloom's Taxonomy level. Furthermore, we explored the feasibility of an AI-driven evaluation approach that rates question usability using the Item Writing Flaws (IWFs) framework. We conclude that GPT-4 generally shows promise in generating relevant and usable questions. However, more work needs to be done to improve Bloom-level alignment accuracy (the accuracy of alignment between GPT-4's target level and the actual level of the generated question). Moreover, we note a general inverse relationship between alignment accuracy and the number of shots, whereas no clear trend between shot number and relevance/usability was observed. These findings shed light on automated question generation and assessment, presenting the potential for advancements in AI-driven educational evaluation methods.
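The shot-selection step described in the abstract can be sketched as a simple maximal marginal relevance (MMR) routine over embedding vectors. The toy 2-D embeddings, the lambda weighting, and the function names below are illustrative assumptions for exposition, not details taken from the paper:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mmr_select(query_vec, candidate_vecs, k, lam=0.7):
    """Greedily pick k candidate indices, trading off relevance to the
    query (weight lam) against redundancy with already-picked examples
    (weight 1 - lam)."""
    selected, remaining = [], list(range(len(candidate_vecs)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, candidate_vecs[i])
            redundancy = max((cosine(candidate_vecs[i], candidate_vecs[j])
                              for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy 2-D "embeddings": candidates 0 and 1 are duplicates, candidate 2 is
# distinct; a diversity-heavy lambda avoids picking both duplicates.
query = [1.0, 0.0]
shots = [[0.9, 0.1], [0.9, 0.1], [0.6, 0.6]]
print(mmr_select(query, shots, k=2, lam=0.3))  # prints [0, 2]
```

With a purely relevance-based selection (lam=1.0) the two duplicate candidates would both be chosen; lowering lambda penalizes redundancy, which is the point of using MMR to assemble a diverse set of few-shot examples.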
Pages: 389 - 396
Page count: 8
Related papers
50 records in total
  • [41] Evaluation of Written Examination Questions of Turkish Language in Accordance with Bloom's Taxonomy
    Gocer, Ali
    CROATIAN JOURNAL OF EDUCATION-HRVATSKI CASOPIS ZA ODGOJ I OBRAZOVANJE, 2011, 13 (02): 161 - 183
  • [42] Automatic item generation based on artificial intelligence and item model with Bloom's taxonomy
    Chen, Po-Hsi
    Hsieh, Chia-En
    INTERNATIONAL JOURNAL OF PSYCHOLOGY, 2024, 59 : 534 - 535
  • [43] USTW Vs. STW: A Comparative Analysis for Exam Question Classification based on Bloom's Taxonomy
    Gani, M. O.
    Ayyasamy, R. K.
    Fui, T.
    Sangodiah, A.
    MENDEL, 2022, 28 (02): 25 - 40
  • [44] A Rule-based Approach in Bloom's Taxonomy Question Classification through Natural Language Processing
    Haris, Syahidah Sufi
    Omar, Nazlia
    2012 7TH INTERNATIONAL CONFERENCE ON COMPUTING AND CONVERGENCE TECHNOLOGY (ICCCT2012), 2012, : 410 - 414
  • [45] Methodology for automated generation of multiple choice questions in self-assessment
    Sanz-Lobera, Alfredo
    Gonzalez Roig, Alfredo
    Gonzalez Requena, Ignacio
    RESEARCH IN ENGINEERING EDUCATION SYMPOSIUM, 2011, : 61 - 69
  • [46] Integrating the revised bloom's taxonomy with multiple intelligences: A planning tool for curriculum differentiation
    Noble, T
    TEACHERS COLLEGE RECORD, 2004, 106 (01): 193 - 211
  • [47] Tapping into Bloom Taxonomy's Higher-Order Cognitive Processes: The Case for Multiple Choice Questions as a Valid Assessment Tool in the ESP Classroom
    Lenchuk, Iryna
    Ahmed, Amer
    ARAB WORLD ENGLISH JOURNAL, 2021: 160 - 171
  • [48] Learning to Reuse Distractors to Support Multiple-Choice Question Generation in Education
    Bitew, Semere Kiros
    Hadifar, Amir
    Sterckx, Lucas
    Deleu, Johannes
    Develder, Chris
    Demeester, Thomas
    IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES, 2024, 17 : 375 - 390
  • [49] Automatic Generation of a Large Multiple-Choice Question-Answer Corpus
    Kauchak, David
    Song, Vivien
    Mishra, Prashant
    Leroy, Gondy
    Harber, Phil
    Rains, Stephen
    Hamre, John
    Morgenstein, Nick
    INTELLIGENT SYSTEMS AND APPLICATIONS, VOL 2, INTELLISYS 2024, 2024, 1066 : 55 - 72
  • [50] Automatic Chinese Multiple Choice Question Generation Using Mixed Similarity Strategy
    Liu, Ming
    Rus, Vasile
    Liu, Li
    IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES, 2018, 11 (02): 193 - 202