Metrics for Estimating Validity, Reliability and Bias in Peer Assessment

Cited by: 0
Authors
Molina-Carmona, Rafael [1]
Satorre-Cuerda, Rosana [1]
Compan-Rosique, Patricia [1]
Llorens-Largo, Faraon [1]
Affiliations
[1] Univ Alicante, Catedra Santander UA Transformac Digital, Ctra San Vicente del Raspeig S-N, Alicante 03690, Spain
Keywords
peer assessment; success rate; agreement degree; reliability; validity; bias; confusion matrix; automatic classification
DOI
Not available
Chinese Library Classification number
G40 [Education]
Discipline classification codes
040101; 120403
Abstract
Peer assessment is a widespread way of evaluating and rating the quality of work in the field of education. Although it proves to be a very effective learning instrument, it is subject to possible problems of reliability and validity and to some potential biases. Most works that study and try to solve these problems focus on specific cases, and the statistics they use to measure reliability, validity or bias are global: they give a measure of these values for the whole process but do not allow an individual study. This work takes a different approach. It proposes metrics for the reliability and validity of each reviewer, as well as an approximation to the biases that may appear in the assessment process, so that the review process can itself be assessed. An analogy is proposed between the work of a reviewer in a peer assessment process and the operation of an automatic classifier. This analogy allows the usual measures for evaluating the quality of automatic classifiers to be used to establish the quality of peer assessment. The reviewers are characterized by their confusion matrices and six new indicators: success rate (which estimates validity); agreement degree (as a measure of reliability); assessment median and its interquartile range (to estimate central tendency and restriction of range biases); and average distance to the diagonal and its standard deviation (to detect possible leniency and harshness biases). The method provides indicators of each reviewer's performance and detects different reviewer profiles, so that the teacher can assess the work of the students as reviewers and introduce correction mechanisms in the final assessment of the works. A practical example of application to an engineering degree is provided to illustrate the potential of the method.
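The classifier analogy in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything here is an assumption: grades are integers on an ordinal scale `0..n_levels-1`, a reference grade (for example the teacher's) exists for each work, the success rate is the share of reviews on the diagonal of the confusion matrix, and the distance to the diagonal is the signed difference between assigned and reference grade. The function name `reviewer_indicators` is hypothetical. The agreement degree is left out because it depends on the other reviewers' grades, which the abstract does not describe.

```python
from statistics import mean, median, pstdev, quantiles


def reviewer_indicators(reference, assigned, n_levels):
    """Summarise one reviewer from paired grades (assumed definitions)."""
    # Confusion matrix: rows = reference grade, columns = reviewer's grade.
    cm = [[0] * n_levels for _ in range(n_levels)]
    for ref, got in zip(reference, assigned):
        cm[ref][got] += 1

    # Success rate (assumed): fraction of reviews lying on the diagonal.
    total = sum(map(sum, cm))
    hits = sum(cm[k][k] for k in range(n_levels))

    # Signed distance to the diagonal: positive means the reviewer
    # graded higher than the reference (a hint of leniency).
    distances = [got - ref for ref, got in zip(reference, assigned)]

    # Quartiles of the grades the reviewer handed out.
    q1, _, q3 = quantiles(assigned, n=4, method="inclusive")

    return {
        "confusion_matrix": cm,
        "success_rate": hits / total,
        "median": median(assigned),
        "iqr": q3 - q1,
        "mean_distance": mean(distances),
        "std_distance": pstdev(distances),
    }


# Made-up grades for eight works on a five-level scale.
reference = [0, 1, 2, 3, 4, 2, 3, 1]
assigned = [1, 1, 3, 3, 4, 3, 4, 2]
result = reviewer_indicators(reference, assigned, n_levels=5)
print(result["success_rate"], result["mean_distance"])
```

Under these assumptions, a positive mean distance together with a small standard deviation would point to a consistently lenient reviewer, while a narrow interquartile range around the middle of the scale would suggest a central tendency or restriction of range bias.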
Pages: 968-980
Page count: 13