Objective and automated protocols for the evaluation of biomedical search engines using No Title Evaluation protocols

Cited by: 3
Author
Campagne, Fabien [1, 2]
Affiliations
[1] Cornell Univ, Weill Med Coll, HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsau, New York, NY 10021 USA
[2] Cornell Univ, Weill Med Coll, Dept Physiol & Biophys, New York, NY 10021 USA
Keywords
Genome - Medical applications - Search engines - Information retrieval
DOI
10.1186/1471-2105-9-132
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: The evaluation of information retrieval techniques has traditionally relied on human judges to determine which documents are relevant to a query and which are not. This protocol is used in the Text REtrieval Conference (TREC), organized annually for the past 15 years, to support the unbiased evaluation of novel information retrieval approaches. The TREC Genomics Track has recently been introduced to measure the performance of information retrieval for biomedical applications.

Results: We describe two protocols for evaluating biomedical information retrieval techniques without human relevance judgments. We call these protocols No Title Evaluation (NT Evaluation). The first protocol measures performance for focused searches, where only one relevant document exists for each query. The second protocol measures performance for high-recall searches, where each query is expected to have potentially many relevant documents. Both protocols take advantage of the clear separation of titles and abstracts found in Medline. We compare the performance obtained with these evaluation protocols to results obtained by reusing the relevance judgments produced in the 2004 and 2005 TREC Genomics Track, and we observe significant correlations between the performance rankings generated by our approach and those generated by TREC. Spearman's correlation coefficients in the range of 0.79-0.92 are observed when comparing bpref measured with NT Evaluation to bpref measured with TREC evaluations. For comparison, coefficients in the range of 0.86-0.94 can be observed when evaluating the same set of methods with data from two independent TREC Genomics Track evaluations. We discuss the advantages of NT Evaluation over the TRels and the data fusion evaluation protocols introduced recently.

Conclusion: Our results suggest that the NT Evaluation protocols described here could be used to optimize some search engine parameters before human evaluation. Further research is needed to determine if NT Evaluation or variants of these protocols can fully substitute for human evaluations.
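
The abstract compares performance rankings using Spearman's correlation coefficient on bpref scores. As a minimal sketch of that kind of comparison (not the author's code), the Python below computes Spearman's coefficient as the Pearson correlation of average ranks. The function names and the bpref values for five retrieval methods are hypothetical, chosen only for illustration, and do not come from the paper.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 (smallest) to n; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks of xs and ys."""
    if len(xs) != len(ys):
        raise ValueError("both score lists must have the same length")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical bpref scores for the same five retrieval methods under
# two evaluation protocols (illustrative values, not taken from the paper).
nt_bpref = [0.31, 0.42, 0.27, 0.55, 0.48]
trec_bpref = [0.29, 0.40, 0.33, 0.51, 0.53]

rho = spearman(nt_bpref, trec_bpref)
print(f"Spearman's rho = {rho:.2f}")
```

A coefficient near 1 means the two protocols order the methods almost identically. The sketch assumes at least two distinct scores in each list; otherwise the rank variance is zero and the division fails.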
Pages: 14