The Use of Bayesian Networks to Assess the Quality of Evidence from Research Synthesis: 2. Inter-Rater Reliability and Comparison with Standard GRADE Assessment

Cited by: 7
Authors
Llewellyn, Alexis [1]
Whittington, Craig [2]
Stewart, Gavin [3]
Higgins, Julian P. T. [4]
Meader, Nick [1]
Affiliations
[1] Univ York, Ctr Reviews & Disseminat, York YO10 5DD, N Yorkshire, England
[2] UCL, Dept Clin Educ & Hlth Psychol, Ctr Outcomes Res & Effectiveness Res, London, England
[3] Newcastle Univ, Sch Agr Food & Rural Dev, Newcastle Upon Tyne NE1 7RU, Tyne & Wear, England
[4] Univ Bristol, Sch Social & Community Med, Bristol, Avon, England
Source
PLOS ONE | 2015 / Volume 10 / Issue 12
DOI
10.1371/journal.pone.0123511
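Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code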
07 ; 0710 ; 09 ;
Abstract
Background: The Grades of Recommendation, Assessment, Development and Evaluation (GRADE) approach is widely implemented in systematic reviews, health technology assessment and guideline development organisations throughout the world. We have previously reported on the development of the Semi-Automated Quality Assessment Tool (SAQAT), which enables a semi-automated validity assessment based on GRADE criteria. The main advantage of our approach is its potential to improve the inter-rater agreement of GRADE assessments, particularly when used by less experienced researchers, because such judgements can be complex and challenging to apply without training. This is the first study examining the inter-rater agreement of the SAQAT.

Methods: We conducted two studies to compare: a) the inter-rater agreement of two researchers using the SAQAT independently on 28 meta-analyses and b) the inter-rater agreement between a researcher using the SAQAT (who had no experience of using GRADE) and an experienced member of the GRADE working group conducting a standard GRADE assessment on 15 meta-analyses.

Results: There was substantial agreement between independent researchers using the SAQAT for all domains (for example, overall GRADE rating: weighted kappa 0.79; 95% CI 0.65 to 0.93). Comparison between the SAQAT and a standard GRADE assessment suggested that inconsistency was parameterised too conservatively by the SAQAT, so the tool was amended. Following amendment, we found fair-to-moderate agreement between the standard GRADE assessment and the SAQAT (for example, overall GRADE rating: weighted kappa 0.35; 95% CI 0.09 to 0.87).

Conclusions: Despite a need for further research, the SAQAT may aid consistent application of GRADE, particularly by less experienced researchers.
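
Note on the agreement statistic: the abstract reports inter-rater agreement as weighted kappa but does not give the weighting scheme or the underlying ratings. The following is a minimal sketch of Cohen's weighted kappa for two raters on an ordered scale, assuming linear weights by default. The function name, the four-level rating scale and the example ratings are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
def weighted_kappa(rater_a, rater_b, categories, weights="linear"):
    """Cohen's weighted kappa for two raters over ordered categories.

    `categories` lists the rating levels from lowest to highest.
    `weights` is "linear" or "quadratic" (an assumption; the abstract
    does not say which scheme the authors used).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    if weights not in ("linear", "quadratic"):
        raise ValueError("weights must be 'linear' or 'quadratic'")

    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions: observed[i][j] is the share of items
    # rated category i by rater A and category j by rater B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # Distance between categories, scaled to the range 0..1.
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    observed_dis = sum(disagreement(i, j) * observed[i][j]
                       for i in range(k) for j in range(k))
    expected_dis = sum(disagreement(i, j) * row[i] * col[j]
                       for i in range(k) for j in range(k))
    if expected_dis == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - observed_dis / expected_dis


# Hypothetical ratings on a four-level quality scale (illustration only).
levels = ["very low", "low", "moderate", "high"]
first = ["high", "moderate", "low", "very low", "moderate", "low"]
second = ["high", "low", "low", "very low", "moderate", "moderate"]
print(round(weighted_kappa(first, second, levels), 3))
```

Weighted kappa gives partial credit when two raters choose adjacent levels, which is why it suits ordered ratings such as GRADE's better than unweighted kappa does.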
Pages: 11
Related papers
3 in total
  • [1] The Use of Bayesian Networks to Assess the Quality of Evidence from Research Synthesis: 1.
    Stewart, Gavin B.
    Higgins, Julian P. T.
    Schuenemann, Holger
    Meader, Nick
    PLOS ONE, 2015, 10 (04):
  • [2] CHIMERAS showed better inter-rater reliability and inter-consensus reliability than GRADE in grading quality of evidence: A randomized controlled trial
    Wu, Xin Yin
    Chung, Vincent C. H.
    Wong, Charlene H. L.
    Yip, Benjamin H. K.
    Cheung, William K. W.
    Wu, Justin C. Y.
    EUROPEAN JOURNAL OF INTEGRATIVE MEDICINE, 2018, 23 : 116 - 122
  • [3] Validity and Inter-rater Reliability of Keyword-based Operative Report Review for Disease Severity Assessment in Pediatric Appendicitis: Results from a Multi-center Pediatric Research Consortium
    Cramm, Shannon L.
    Graham, Dionne A.
    Allukian, Myron
    Blakely, Martin
    Chandler, Nicole M.
    Cowles, Robert A.
    DeFazio, Jennifer
    Feng, Christina
    Griggs, Cornelia
    Kunisaki, Shaun M.
    Lipskar, Aaron
    Russell, Robert T.
    Rangel, Shawn J.
    PEDIATRICS, 2022, 149 (01)