Towards Automated Multiple Choice Question Generation and Evaluation: Aligning with Bloom's Taxonomy

Cited by: 1
Authors
Hwang, Kevin [1]
Wang, Kenneth [1]
Alomair, Maryam [2]
Choa, Fow-Sen [2]
Chen, Lujie Karen [2]
Affiliations
[1] Glenelg High Sch, Glenelg, MD 21737 USA
[2] Univ Maryland Baltimore Cty, Baltimore, MD 21250 USA
Keywords
automated question generation; GPT-4; Bloom's taxonomy; large language models; multiple choice question generation; item writing flaws
DOI
10.1007/978-3-031-64299-9_35
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multiple Choice Questions (MCQs) are frequently used for educational assessments for their efficiency in grading and providing feedback. However, manually generating MCQs has some limitations and challenges. This study explores an AI-driven approach to creating and evaluating Bloom's Taxonomy-aligned college-level biology MCQs using a varied number of shots in few-shot prompting with GPT-4. Shots, or examples of correct prompt-response pairs, were sourced from previously published datasets containing educator-approved MCQs labeled with their Bloom's Taxonomy level and were matched to prompts via a maximal marginal relevance search. To obtain ground truths to compare GPT-4 against, three expert human evaluators with a minimum of 4 years of educational experience annotated a random sample of the generated questions with regard to relevance to the input prompt, classroom usability, and perceived Bloom's Taxonomy level. Furthermore, we explored the feasibility of an AI-driven evaluation approach that can rate question usability using the Item Writing Flaws (IWFs) framework. We conclude that GPT-4 generally shows promise in generating relevant and usable questions. However, more work needs to be done to improve Bloom-level alignment accuracy (accuracy of alignment between GPT-4's target level and the actual level of the generated question). Moreover, we note that a general inverse relationship exists between alignment accuracy and number of shots. On the other hand, no clear trend between shot number and relevance/usability was observed. These findings shed light on automated question generation and assessment, presenting the potential for advancements in AI-driven educational evaluation methods.
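The abstract states that shots were matched to prompts via a maximal marginal relevance (MMR) search but does not give the embedding model, the trade-off weight, or the implementation. The sketch below is a generic, dependency-free illustration of MMR selection under those assumptions, not the authors' code. The function names `cosine` and `mmr_select`, the parameter `lambda_mult`, and the toy two-dimensional vectors are all hypothetical stand-ins for real embeddings of prompts and example MCQs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mmr_select(query_vec, candidate_vecs, k, lambda_mult=0.5):
    """Return indices of up to k candidates chosen by maximal marginal relevance.

    Each step picks the candidate maximizing
        lambda_mult * sim(candidate, query)
        - (1 - lambda_mult) * max sim(candidate, already selected)
    so lambda_mult = 1.0 reduces to plain relevance ranking.
    lambda_mult = 0.5 is an assumed default, not a value from the paper.
    """
    relevance = [cosine(query_vec, c) for c in candidate_vecs]
    selected = []
    remaining = list(range(len(candidate_vecs)))
    while remaining and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in remaining:
            # Redundancy is the highest similarity to any shot already chosen.
            redundancy = max(
                (cosine(candidate_vecs[i], candidate_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected


# Toy embeddings (hypothetical): candidates 0 and 1 are near-duplicates.
query = [1.0, 0.0]
candidates = [[0.9, 0.1], [0.9, 0.12], [0.7, -0.7]]
shots = mmr_select(query, candidates, k=2)
```

In this toy case the second pick skips the near-duplicate candidate in favor of the less similar one, which is the behavior that makes MMR attractive for assembling a varied set of few-shot examples.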
Pages: 389-396
Number of pages: 8