Estimating reliability under a generalizability theory model for test scores composed of testlets

Cited by: 31

Authors:

Lee, GM

Frisbie, DA

Affiliations:

[1] CTB, Monterey, CA 93940 USA

[2] Univ Iowa, Iowa Testing Programs, Iowa City, IA 52242 USA

Source:

APPLIED MEASUREMENT IN EDUCATION | 1999, Vol. 12, No. 3

Keywords:

DOI:

10.1207/S15324818AME1203_2

Chinese Library Classification:

G40 [Education];

Discipline classification codes:

040101 ; 120403 ;

Abstract:

Previous studies have indicated that the reliability of test scores composed of testlets might be overestimated by conventional item-based reliability estimation methods (Anastasi, 1988; Sireci, Thissen, & Wainer, 1991; Thorndike, 1951; Wainer, 1995; Wainer & Thissen, 1996). We designed this study to investigate the appropriateness and implications of using a generalizability theory (G-theory) approach to estimating the reliability of scores from tests composed of testlets. The magnitude of overestimation from using Cronbach's alpha based on item scores in this situation was found to be about 0.04 relative to the testlet approach with G-theory. The generalizability coefficients based on varying numbers of passages and a fixed total number of items were found to be more variable than when the number of passages was fixed, the total number of items was fixed, and the number of items per passage varied. Therefore, manipulating the number of passages is a more productive way to obtain efficient measurement procedures than manipulating the number of items within each passage.


Pages: 237-255

Page count: 19

50 records in total

[31] Using generalizability theory to investigate the variability and reliability of EFL composition scores by human raters and e-rater
Sari, Elif
Han, Turgay
PORTA LINGUARUM, 2022, (38) : 27 - 45
[32] Using generalizability theory and the ERP reliability analysis (ERA) toolbox for assessing test-retest reliability of ERP scores part 2: Application to food-based tasks and stimuli
Carbine, Kaylie A.
Clayson, Peter E.
Baldwin, Scott A.
LeCheminant, James D.
Larson, Michael J.
INTERNATIONAL JOURNAL OF PSYCHOPHYSIOLOGY, 2021, 166 : 188 - 198
[33] Reliability of gain scores under realistic assumptions about properties of pre-test and post-test scores
Zimmerman, DW
Williams, RH
BRITISH JOURNAL OF MATHEMATICAL & STATISTICAL PSYCHOLOGY, 1998, 51 : 343 - 351
[34] Interval Estimation Procedures for True Scores of a Test Composed of Polytomous Items: An Application of the Multinomial Error Model
Kim, Kyung Yong
Park, Seohee
Lee, Won-Chan
PSYCHOLOGICAL METHODS, 2021, 26 (03) : 343 - 356
[35] A CAUTIONARY NOTE ON ESTIMATING THE RELIABILITY OF A MASTERY TEST WITH THE BETA-BINOMIAL MODEL
WILCOX, RR
APPLIED PSYCHOLOGICAL MEASUREMENT, 1981, 5 (04) : 531 - 537
[36] Estimating Model Performance Under Domain Shifts with Class-Specific Confidence Scores
Li, Zeju
Kamnitsas, Konstantinos
Islam, Mobarakol
Chen, Chen
Glocker, Ben
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT VII, 2022, 13437 : 693 - 703
[37] The comparison of the scores obtained by Bayesian nonparametric model and classical test theory methods*
Yurtcu, Meltem
Kelecioglu, Hulya
Boone, Edward L.
SCIENCE PROGRESS, 2021, 104 (03)
[38] Development and validation of the peptic ulcer scale under the system of quality of life instruments for chronic diseases based on classical test theory and generalizability theory
Chonghua Wan
Ying Chen
Li Gao
Qingqing Zhang
Peng Quan
Xiaoyuan Sun
BMC Gastroenterology, 20
[39] Development and validation of the peptic ulcer scale under the system of quality of life instruments for chronic diseases based on classical test theory and generalizability theory
Wan, Chonghua
Chen, Ying
Gao, Li
Zhang, Qingqing
Quan, Peng
Sun, Xiaoyuan
BMC GASTROENTEROLOGY, 2020, 20 (01)
[40] PROPERTIES OF EXPONENTIAL SCORES TEST FOR K-SAMPLE PROBLEM UNDER LEHMANN MODEL
BURR, P
YOUNG, DH
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) : 79 - 85
