A Comparison of the Relative Performance of Four IRT Models on Equating Passage-Based Tests

Cited by: 4

Authors:

Kim, Kyung Yong [1]

Lim, Euijin [2]

Lee, Won-Chan [3]

Affiliations:

[1] Univ North Carolina Greensboro, Educ Res Methodol, Greensboro, NC 27412 USA

[2] Seoul Natl Univ, TEPS Ctr, Language Educ Inst, Seoul, South Korea

[3] Univ Iowa, CASMA, Iowa City, IA 52242 USA

Source:

INTERNATIONAL JOURNAL OF TESTING | 2019 / Volume 19 / Issue 03

Keywords:

equating; item response theory; bifactor model; testlet response theory model

DOI:

10.1080/15305058.2018.1530239

Chinese Library Classification number:

C [Social Sciences, General]

Discipline classification code:

03; 0303

Abstract:

For passage-based tests, items that belong to a common passage often violate the local independence assumption of unidimensional item response theory (UIRT). In this case, ignoring local item dependence (LID) and estimating item parameters using a UIRT model could be problematic because doing so might result in inaccurate parameter estimates, which, in turn, could impact the results of equating. Under the random groups design, the main purpose of this article was to compare the relative performance of the three-parameter logistic (3PL), graded response (GR), bifactor, and testlet models on equating passage-based tests when various degrees of LID were present due to passage. Simulation results showed that the testlet model produced the most accurate equating results, followed by the bifactor model. The 3PL model worked as well as the bifactor and testlet models when the degree of LID was low but returned less accurate equating results than the two multidimensional models as the degree of LID increased. Among the four models, the polytomous GR model provided the least accurate equating results.

Pages: 248-269

Page count: 22

