Understanding metric-related pitfalls in image analysis validation

Cited by: 34
Authors
Reinke, Annika [1 ,2 ,3 ]
Tizabi, Minu D. [1 ,4 ]
Baumgartner, Michael [5 ]
Eisenmann, Matthias [1 ]
Heckmann-Noetzel, Doreen [1 ,4 ]
Kavur, A. Emre [1 ,5 ,6 ]
Raedsch, Tim [1 ,2 ]
Sudre, Carole H. [7 ,8 ,9 ]
Acion, Laura [10 ]
Antonelli, Michela [9 ,11 ]
Arbel, Tal [12 ,13 ]
Bakas, Spyridon [14 ,15 ]
Benis, Arriel [16 ,17 ]
Buettner, Florian [18 ,19 ,20 ,21 ,22 ]
Cardoso, M. Jorge [9 ]
Cheplygina, Veronika [23 ]
Chen, Jianxu [24 ]
Christodoulou, Evangelia [1 ]
Cimini, Beth A. [25 ]
Farahani, Keyvan [26 ]
Ferrer, Luciana [27 ]
Galdran, Adrian [28 ,29 ]
van Ginneken, Bram [30 ,31 ]
Glocker, Ben [32 ]
Godau, Patrick [1 ,3 ,4 ]
Hashimoto, Daniel A. [33 ,34 ]
Hoffman, Michael M. [35 ,36 ,37 ,38 ]
Huisman, Merel [39 ]
Isensee, Fabian [5 ,6 ]
Jannin, Pierre [40 ,41 ]
Kahn, Charles E. [42 ,43 ]
Kainmueller, Dagmar [44 ,45 ,46 ]
Kainz, Bernhard [47 ,48 ]
Karargyris, Alexandros [49 ]
Kleesiek, Jens [50 ]
Kofler, Florian [51 ]
Kooi, Thijs [52 ]
Kopp-Schneider, Annette [53 ]
Kozubek, Michal [54 ,55 ]
Kreshuk, Anna [56 ]
Kurc, Tahsin [57 ]
Landman, Bennett A. [58 ]
Litjens, Geert [59 ]
Madani, Amin [60 ]
Maier-Hein, Klaus [5 ,61 ]
Martel, Anne L. [36 ,62 ]
Meijering, Erik [63 ]
Menze, Bjoern [64 ]
Moons, Karel G. M. [65 ]
Mueller, Henning [66 ,67 ]
Affiliations
[1] German Canc Res Ctr DKFZ Heidelberg, Div Intelligent Med Syst, Heidelberg, Germany
[2] German Canc Res Ctr DKFZ Heidelberg, HI Helmholtz Imaging, Heidelberg, Germany
[3] Heidelberg Univ, Fac Math & Comp Sci, Heidelberg, Germany
[4] NCT Heidelberg, Natl Ctr Tumor Dis NCT, Heidelberg, Germany
[5] German Canc Res Ctr DKFZ Heidelberg, Div Med Image Comp, Heidelberg, Germany
[6] German Canc Res Ctr DKFZ Heidelberg, HI Appl Comp Vis Lab, Heidelberg, Germany
[7] UCL, MRC Unit Lifelong Hlth & Ageing, London, England
[8] UCL, Dept Comp Sci, Ctr Med Image Comp, London, England
[9] Kings Coll London, Sch Biomed Engn & Imaging Sci, London, England
[10] Univ Buenos Aires, Inst Calculo, CONICET, Buenos Aires, DF, Argentina
[11] UCL, Ctr Med Image Comp, London, England
[12] McGill Univ, Ctr Intelligent Machines, Montreal, PQ, Canada
[13] McGill Univ, MILA Quebec Artificial Intelligence Inst, Montreal, PQ, Canada
[14] Indiana Univ Sch Med, Dept Pathol & Lab Med, Div Computat Pathol, Indianapolis, IN 46202 USA
[15] Univ Penn, Ctr Biomed Image Comp & Analyt CBICA, Philadelphia, PA 19104 USA
[16] Holon Inst Technol, Dept Digital Med Technol, Holon, Israel
[17] European Federat Med Informat, Le Mt Sur Lausanne, Switzerland
[18] German Canc Consortium DKTK, Partner Site Frankfurt Mainz, Frankfurt, Germany
[19] German Canc Res Ctr DKFZ Heidelberg, Heidelberg, Germany
[20] Goethe Univ Frankfurt, Dept Med, Frankfurt, Germany
[21] Goethe Univ Frankfurt, Dept Informat, Frankfurt, Germany
[22] Frankfurt Canc Institute, Frankfurt, Germany
[23] IT Univ Copenhagen, Dept Comp Sci, Copenhagen, Denmark
[24] Leibniz Inst Analyt Wissensch ISAS eV, Dortmund, Germany
[25] Broad Inst MIT & Harvard, Imaging Platform, Cambridge, MA 02142 USA
[26] NCI, Ctr Biomed Informat & Informat Technol, Bethesda, MD 20892 USA
[27] UBA, CONICET, Inst Invest Ciencias Computac ICC, Buenos Aires, DF, Argentina
[28] Univ Pompeu Fabra, Barcelona, Spain
[29] Univ Adelaide, Adelaide, SA, Australia
[30] Fraunhofer MEVIS, Bremen, Germany
[31] Radboud Univ Nijmegen, Med Ctr, Radboud Inst Hlth Sci, Nijmegen, Netherlands
[32] Imperial Coll London, Dept Comp, South Kensington Campus, London, England
[33] Perelman Sch Med, Dept Surg, Philadelphia, PA USA
[34] Univ Penn, Sch Engn & Appl Sci, Gen Robot Automat Sensing & Percept Lab, Philadelphia, PA 19104 USA
[35] Univ Hlth Network, Princess Margaret Canc Ctr, Toronto, ON, Canada
[36] Univ Toronto, Dept Med Biophys, Toronto, ON, Canada
[37] Univ Toronto, Dept Comp Sci, Toronto, ON, Canada
[38] Vector Inst Artificial Intelligence, Toronto, ON, Canada
[39] Radboud Univ Nijmegen, Med Ctr, Dept Radiol & Nucl Med, Nijmegen, Netherlands
[40] Univ Rennes 1, Lab Traitement Signal & Image, UMR S 1099, Rennes, France
[41] INSERM, Paris, France
[42] Univ Penn, Dept Radiol, Philadelphia, PA 19104 USA
[43] Univ Penn, Inst Biomed Informat, Philadelphia, PA 19104 USA
[44] Max Delbruck Ctr Mol Med, Helmholtz Assoc MDC, Biomed Image Anal, Berlin, Germany
[45] HI Helmholtz Imaging, Berlin, Germany
[46] Univ Potsdam, Digital Engn Fac, Potsdam, Germany
[47] Imperial Coll London, Fac Engn, Dept Comp, London, England
[48] Friedrich Alexander Univ, Dept AIBE, Erlangen, Germany
[49] IHU Strasbourg, Strasbourg, France
[50] Univ Med Essen, Inst AI Med IKIM, Translat Image Guided Oncol TIO, Essen, Germany
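Funding
Dutch Research Council; Swiss National Science Foundation; Engineering and Physical Sciences Research Council (UK); European Research Council; Natural Sciences and Engineering Research Council of Canada; Wellcome Trust (UK); National Institutes of Health (USA); Innovate UK; Academy of Finland;
Keywords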
SEGMENTATION;
DOI
10.1038/s41592-023-02150-0
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Validation metrics are key for tracking scientific progress and bridging the current chasm between artificial intelligence research and its translation into practice. However, increasing evidence shows that, particularly in image analysis, metrics are often chosen inadequately. Although taking into account the individual strengths, weaknesses and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multistage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides a reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Although focused on biomedical image analysis, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. The work serves to enhance global comprehension of a key topic in image analysis validation. This Perspective presents a reliable and comprehensive source of information on pitfalls related to validation metrics in image analysis, with an emphasis on biomedical imaging.
Pages: 182 - 194
Page count: 20
Related papers
50 entries in total
  • [1] Understanding metric-related pitfalls in image analysis validation
    Annika Reinke
    Minu D. Tizabi
    Michael Baumgartner
    Matthias Eisenmann
    Doreen Heckmann-Nötzel
    A. Emre Kavur
    Tim Rädsch
    Carole H. Sudre
    Laura Acion
    Michela Antonelli
    Tal Arbel
    Spyridon Bakas
    Arriel Benis
    Florian Buettner
    M. Jorge Cardoso
    Veronika Cheplygina
    Jianxu Chen
    Evangelia Christodoulou
    Beth A. Cimini
    Keyvan Farahani
    Luciana Ferrer
    Adrian Galdran
    Bram van Ginneken
    Ben Glocker
    Patrick Godau
    Daniel A. Hashimoto
    Michael M. Hoffman
    Merel Huisman
    Fabian Isensee
    Pierre Jannin
    Charles E. Kahn
    Dagmar Kainmueller
    Bernhard Kainz
    Alexandros Karargyris
    Jens Kleesiek
    Florian Kofler
    Thijs Kooi
    Annette Kopp-Schneider
    Michal Kozubek
    Anna Kreshuk
    Tahsin Kurc
    Bennett A. Landman
    Geert Litjens
    Amin Madani
    Klaus Maier-Hein
    Anne L. Martel
    Erik Meijering
    Bjoern Menze
    Karel G. M. Moons
    Henning Müller
    Nature Methods, 2024, 21 : 182 - 194
  • [2] LEVERAGING EVALUATION METRIC-RELATED TRAINING CRITERIA FOR SPEECH SUMMARIZATION
    Lin, Shih-Hsiang
    Chang, Yu-Mei
    Liu, Jia-Wen
    Chen, Berlin
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 5314 - 5317
  • [3] Extractive speech summarization using evaluation metric-related training criteria
    Chen, Berlin
    Lin, Shih-Hsiang
    Chang, Yu-Mei
    Liu, Jia-Wen
    INFORMATION PROCESSING & MANAGEMENT, 2013, 49 (01) : 1 - 12
  • [4] EVALUATION METRIC FOR IMAGE UNDERSTANDING
    Hemery, Baptiste
    Laurent, Helene
    Rosenberger, Christophe
    2009 16TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOLS 1-6, 2009, : 40 - 40
  • [5] Evaluation metric of an image understanding result
    Hemery, Baptiste
    Laurent, Helene
    Emile, Bruno
    Rosenberger, Christophe
    JOURNAL OF ELECTRONIC IMAGING, 2015, 24 (01)
  • [6] Parametrization of an image understanding quality metric with a subjective evaluation
    Hemery, B.
    Laurent, H.
    Emile, B.
    Rosenberger, C.
    PATTERN RECOGNITION LETTERS, 2013, 34 (05) : 511 - 518
  • [7] Subjective tests for image fusion evaluation and objective metric validation
    Petrovic, Vladimir
    INFORMATION FUSION, 2007, 8 (02) : 208 - 216
  • [8] Statistical validation metric for accuracy assessment in medical image segmentation
    Popovic, Aleksandra
    de la Fuente, Matias
    Engelhardt, Martin
    Radermacher, Klaus
    INTERNATIONAL JOURNAL OF COMPUTER ASSISTED RADIOLOGY AND SURGERY, 2007, 2 (3-4) : 169 - 181
  • [9] Statistical validation metric for accuracy assessment in medical image segmentation
    Aleksandra Popovic
    Matías de la Fuente
    Martin Engelhardt
    Klaus Radermacher
    International Journal of Computer Assisted Radiology and Surgery, 2007, 2 : 169 - 181
  • [10] Image segmentation metric and its application in the analysis of microscopic image
    Ma B.-Y.
    Jiang S.-F.
    Yin D.
    Shen H.-K.
    Ban X.-J.
    Huang H.-Y.
    Wang H.
    Xue W.-H.
    Feng H.
    Gongcheng Kexue Xuebao/Chinese Journal of Engineering, 2021, 43 (01) : 137 - 149