Crowdsourcing the Evaluation of Multiple-Choice Questions Using Item-Writing Flaws and Bloom's Taxonomy

Cited by: 2
Authors
Moore, Steven [1 ]
Fang, Ellen [1 ]
Nguyen, Huy A. [1 ]
Stamper, John [1 ]
Affiliations
[1] Carnegie Mellon University, Human-Computer Interaction Institute, Pittsburgh, PA 15213, USA
Keywords
crowdsourcing; learnersourcing; question evaluation; question quality; question generation; quality
DOI
10.1145/3573051.3593396
Chinese Library Classification (CLC) code
TP39 [Applications of Computers]
Discipline classification codes
081203; 0835
Abstract
Multiple-choice questions, which are widely used in educational assessments, have the potential to negatively impact student learning and skew analytics when they contain item-writing flaws. Existing methods for evaluating multiple-choice questions in educational contexts tend to focus primarily on machine readability metrics, such as grammar, syntax, and formatting, without considering the intended use of the questions within course materials and their pedagogical implications. In this study, we present the results of crowdsourcing the evaluation of multiple-choice questions based on 15 common item-writing flaws. Through analysis of 80 crowdsourced evaluations on questions from the domains of calculus and chemistry, we found that crowdworkers were able to accurately evaluate the questions, matching 75% of the expert evaluations across multiple questions. They were able to correctly distinguish between two levels of Bloom's Taxonomy for the calculus questions, but were less accurate for chemistry questions. We discuss how to scale this question evaluation process and the implications it has across other domains. This work demonstrates how crowdworkers can be leveraged in the quality evaluation of educational questions, regardless of prior experience or domain knowledge.
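The abstract reports that crowdworkers matched 75% of expert evaluations across the 15 item-writing flaws. As an illustration only (not the authors' code; the flaw names, data, and function below are hypothetical), a minimal sketch of how a per-question agreement rate over binary flaw judgments could be computed:

```python
# Illustrative sketch (not from the paper): crowd-expert agreement over
# binary item-writing-flaw judgments. Flaw labels and data are hypothetical.
from typing import Dict, List

FLAWS: List[str] = ["implausible_distractors", "negative_wording", "all_of_the_above"]  # subset of the 15 flaws

def agreement(crowd: Dict[str, bool], expert: Dict[str, bool]) -> float:
    """Fraction of flaw judgments on which a crowd evaluation matches the expert evaluation."""
    matches = sum(crowd[f] == expert[f] for f in FLAWS)
    return matches / len(FLAWS)

# Example: one question evaluated by a crowdworker and by an expert.
crowd_eval = {"implausible_distractors": True, "negative_wording": False, "all_of_the_above": False}
expert_eval = {"implausible_distractors": True, "negative_wording": True, "all_of_the_above": False}

print(f"Agreement: {agreement(crowd_eval, expert_eval):.0%}")  # -> 67%
```

In practice the same idea would be applied across all 15 flaws and many questions, and a chance-corrected statistic such as Cohen's kappa could be used instead of raw percent agreement.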
Pages: 25-34
Number of pages: 10
Related papers
50 records in total
  • [42] Item Analysis of Multiple-choice Questions (MCQs): Assessment Tool For Quality Assurance Measures
    Elgadal, Amani H.
    Mariod, Abdalbasit A.
    SUDAN JOURNAL OF MEDICAL SCIENCES, 2021, 16 (03): 334-346
  • [43] Inversions in True-False and in Multiple-Choice Questions - New Form of Item Analysis
    Koeslag, J. H.
    Melzer, C. W.
    Schach, S. R.
    MEDICAL EDUCATION, 1979, 13 (06): 420-424
  • [44] Theoretical evaluation of partial credit scoring of the multiple-choice test item
    Persson, Rasmus A. X.
    METRON-INTERNATIONAL JOURNAL OF STATISTICS, 2023, 81 (02): 143-161
  • [45] Theoretical evaluation of partial credit scoring of the multiple-choice test item
    Rasmus A. X. Persson
    METRON, 2023, 81: 143-161
  • [46] Using cognitive models to develop quality multiple-choice questions
    Pugh, Debra
    De Champlain, Andre
    Gierl, Mark
    Lai, Hollis
    Touchie, Claire
    MEDICAL TEACHER, 2016, 38 (08): 838-843
  • [47] Assessing declarative and procedural knowledge using multiple-choice questions
    Abu-Zaid, Ahmed
    Khan, Tehreem A.
    MEDICAL EDUCATION ONLINE, 2013, 18
  • [48] Automated Assessment with Multiple-choice Questions using Weighted Answers
    Zampirolli, Francisco de Assis
    Batista, Valerio Ramos
    Rodriguez, Carla
    da Rocha, Rafaela Vilela
    Goya, Denise
    CSEDU: PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED EDUCATION - VOL 1, 2021: 254-261
  • [49] Identification of technical item flaws leads to improvement of the quality of single best Multiple Choice Questions
    Khan, Humaira Fayyaz
    Danish, Khalid Farooq
    Awan, Azra Saeed
    Anwar, Masood
    PAKISTAN JOURNAL OF MEDICAL SCIENCES, 2013, 29 (03): 715-718
  • [50] System for Marking Multiple-Choice Questions and Performing Item Analysis Using a Programmable Calculator and Incorporating a New Index of Discriminatory Ability of Questions
    Jackson, J. R.
    JOURNAL OF ANATOMY, 1972, 113 (NOV): 301