Towards Automated Multiple Choice Question Generation and Evaluation: Aligning with Bloom's Taxonomy

Times Cited: 1
Authors
Hwang, Kevin [1 ]
Wang, Kenneth [1 ]
Alomair, Maryam [2 ]
Choa, Fow-Sen [2 ]
Chen, Lujie Karen [2 ]
Affiliations
[1] Glenelg High School, Glenelg, MD 21737, USA
[2] University of Maryland Baltimore County, Baltimore, MD 21250, USA
Keywords
automated question generation; GPT-4; Bloom's taxonomy; large language models; multiple choice question generation; item writing flaws
DOI
10.1007/978-3-031-64299-9_35
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Multiple Choice Questions (MCQs) are frequently used in educational assessments for their efficiency in grading and providing feedback. However, manually generating MCQs is labor-intensive and presents several challenges. This study explores an AI-driven approach to creating and evaluating Bloom's Taxonomy-aligned college-level biology MCQs using a varied number of shots in few-shot prompting with GPT-4. Shots, or examples of correct prompt-response pairs, were sourced from previously published datasets of educator-approved MCQs labeled with their Bloom's Taxonomy levels and were matched to prompts via a maximal marginal relevance search. To obtain ground truths against which to compare GPT-4, three expert human evaluators, each with at least 4 years of educational experience, annotated a random sample of the generated questions with regard to relevance to the input prompt, classroom usability, and perceived Bloom's Taxonomy level. Furthermore, we explored the feasibility of an AI-driven evaluation approach that rates question usability using the Item Writing Flaws (IWFs) framework. We conclude that GPT-4 generally shows promise in generating relevant and usable questions. However, more work is needed to improve Bloom-level alignment accuracy (the accuracy of alignment between GPT-4's target level and the actual level of the generated question). Moreover, we observe a general inverse relationship between alignment accuracy and the number of shots, whereas no clear trend between the number of shots and relevance or usability was observed. These findings shed light on automated question generation and assessment and point to potential advances in AI-driven educational evaluation methods.
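The shot-matching step the abstract describes, selecting few-shot examples via a maximal marginal relevance (MMR) search, can be illustrated with a short sketch. The code below is not the authors' implementation; the function name, the lambda trade-off parameter, and the random unit vectors standing in for MCQ embeddings are all assumptions made for illustration. MMR greedily balances each candidate's relevance to the input prompt against its redundancy with examples already chosen.

```python
# A minimal sketch (not the paper's released code) of MMR-based shot
# selection: pick k examples from a pool of Bloom-labeled MCQ embeddings.
import numpy as np

def mmr_select(query_vec: np.ndarray, pool_vecs: list[np.ndarray],
               k: int, lam: float = 0.5) -> list[int]:
    """Return indices of k pool items. Each step picks the item maximizing
    lam * sim(item, query) - (1 - lam) * max sim(item, already selected).
    Vectors are assumed unit-norm, so a dot product is cosine similarity."""
    selected: list[int] = []
    remaining = list(range(len(pool_vecs)))
    while remaining and len(selected) < k:
        def score(i: int) -> float:
            relevance = float(pool_vecs[i] @ query_vec)
            redundancy = max((float(pool_vecs[i] @ pool_vecs[j])
                              for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy usage: random unit vectors stand in for embeddings of candidate shots.
rng = np.random.default_rng(0)
pool = [v / np.linalg.norm(v) for v in rng.standard_normal((20, 64))]
print(mmr_select(query_vec=pool[0], pool_vecs=pool, k=3))
```

With lam = 1.0 this reduces to plain nearest-neighbor retrieval; lowering lam trades some relevance for diversity among the selected shots.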
Pages: 389-396
Number of Pages: 8
Related Papers
50 records in total
  • [21] A Model for Course Backward Design: Aligning Outcomes and Assessments with Bloom's Taxonomy and Vision & Change
    Ross, J. A.
    INTEGRATIVE AND COMPARATIVE BIOLOGY, 2018, 58 : E409 - E409
  • [22] Improving project management curriculum by aligning course learning outcomes with Bloom's taxonomy framework
    Karanja, Erastus
    Malone, Laurell C.
    JOURNAL OF INTERNATIONAL EDUCATION IN BUSINESS, 2021, 14 (02) : 197 - 218
  • [23] BloomLLM: Large Language Models Based Question Generation Combining Supervised Fine-Tuning and Bloom's Taxonomy
    Nghia Duong-Trung
    Wang, Xia
    Kravcik, Milos
    TECHNOLOGY ENHANCED LEARNING FOR INCLUSIVE AND EQUITABLE QUALITY EDUCATION, PT II, EC-TEL 2024, 2024, 15160 : 93 - 98
  • [24] Using Bloom's Taxonomy to teach sustainability in multiple contexts
    Pappas, E.
    Pierrakos, O.
    Nagel, R.
    JOURNAL OF CLEANER PRODUCTION, 2013, 48 : 54 - 64
  • [25] Ontology-Based Multiple Choice Question Generation
    Alsubait, Tahani
    Parsia, Bijan
    Sattler, Ulrike
    KUNSTLICHE INTELLIGENZ, 2016, 30 (02): : 183 - 188
  • [26] Ontology-Based Multiple Choice Question Generation
    Al-Yahya, Maha
SCIENTIFIC WORLD JOURNAL, 2014
  • [27] An effective deep learning pipeline for improved question classification into bloom's taxonomy's domains
    Sharma, Harsh
    Mathur, Rohan
    Chintala, Tejas
    Dhanalakshmi, Samiappan
    Senthil, Ramalingam
    EDUCATION AND INFORMATION TECHNOLOGIES, 2023, 28 (05) : 5105 - 5145
  • [29] Automatic Multiple Choice Question Generation From Text: A Survey
    Rao, Dhawaleswar C. H.
    Saha, Sujan Kumar
    IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES, 2020, 13 (01): : 14 - 25
  • [30] Formative and Summative Automated Assessment with Multiple-Choice Question Banks
    Beerepoot, Maarten T. P.
    JOURNAL OF CHEMICAL EDUCATION, 2023, 100 (08) : 2947 - 2955