Metrics reloaded: recommendations for image analysis validation

Cited by: 86
Authors
Maier-Hein, Lena [1, 2, 3, 4, 5]
Reinke, Annika [1, 2, 3]
Godau, Patrick [1, 3, 5]
Tizabi, Minu D. [1, 5]
Buettner, Florian [6, 7, 8, 9, 10]
Christodoulou, Evangelia [1]
Glocker, Ben [11]
Isensee, Fabian [12, 13]
Kleesiek, Jens [14]
Kozubek, Michal [15, 16]
Reyes, Mauricio [17, 18]
Riegler, Michael A. [19, 20]
Wiesenfarth, Manuel [21]
Kavur, A. Emre [1, 12, 13]
Sudre, Carole H. [22, 23, 24]
Baumgartner, Michael [12]
Eisenmann, Matthias [1]
Heckmann-Noetzel, Doreen [1, 5]
Raedsch, Tim [1, 2]
Acion, Laura [25]
Antonelli, Michela [24, 26]
Arbel, Tal [27, 28]
Bakas, Spyridon [29, 30]
Benis, Arriel [31, 32]
Blaschko, Matthew B. [33]
Cardoso, M. Jorge [24]
Cheplygina, Veronika [34]
Cimini, Beth A. [35]
Collins, Gary S. [36]
Farahani, Keyvan [37]
Ferrer, Luciana [38]
Galdran, Adrian [39, 40]
van Ginneken, Bram [41, 42]
Haase, Robert [43, 44, 94]
Hashimoto, Daniel A. [45, 46]
Hoffman, Michael M. [47, 48, 49, 50]
Huisman, Merel [51]
Jannin, Pierre [52, 53]
Kahn, Charles E. [54, 55]
Kainmueller, Dagmar [56, 57]
Kainz, Bernhard [58, 59]
Karargyris, Alexandros [60]
Karthikesalingam, Alan [61]
Kofler, Florian [62]
Kopp-Schneider, Annette [21]
Kreshuk, Anna [63]
Kurc, Tahsin [64]
Landman, Bennett A. [65]
Litjens, Geert [66]
Madani, Amin [67]
Affiliations
[1] German Canc Res Ctr DKFZ Heidelberg, Div Intelligent Med Syst, Heidelberg, Germany
[2] German Canc Res Ctr DKFZ Heidelberg, HI Helmholtz Imaging, Heidelberg, Germany
[3] Heidelberg Univ, Fac Math & Comp Sci, Heidelberg, Germany
[4] Heidelberg Univ, Med Fac, Heidelberg, Germany
[5] NCT Heidelberg, DKFZ & Univ Med Ctr Heidelberg, Natl Ctr Tumor Dis NCT, Heidelberg, Germany
[6] German Canc Consortium DKTK, Partner Site Frankfurt Mainz, DKFZ & UCT Frankfurt Marburg, Frankfurt, Germany
[7] German Canc Res Ctr DKFZ Heidelberg, Heidelberg, Germany
[8] Goethe Univ Frankfurt, Dept Med, Frankfurt, Germany
[9] Goethe Univ Frankfurt, Dept Informat, Frankfurt, Germany
[10] Frankfurt Canc Inst, Frankfurt, Germany
[11] Imperial Coll London, Dept Comp, South Kensington Campus, London, England
[12] German Canc Res Ctr DKFZ Heidelberg, Div Med Image Comp, Heidelberg, Germany
[13] German Canc Res Ctr DKFZ Heidelberg, HI Appl Comp Vis Lab, Heidelberg, Germany
[14] Univ Med Essen, Inst AI Transfus Med, Essen, Germany
[15] Masaryk Univ, Ctr Biomed Image Anal, Brno, Czech Republic
[16] Masaryk Univ, Fac Informat, Brno, Czech Republic
[17] Univ Bern, ARTORG Ctr Biomed Engn Res, Bern, Switzerland
[18] Univ Bern, Dept Radiat Oncol, Univ Hosp Bern, Bern, Switzerland
[19] Simula Metropolitan Ctr Digital Engn, Oslo, Norway
[20] UiT Arctic Univ Norway, Dept Comp Sci, Tromso, Norway
[21] German Canc Res Ctr DKFZ Heidelberg, Div Biostat, Heidelberg, Germany
[22] UCL, MRC Unit Lifelong Hlth & Ageing UCL, Dept Comp Sci, London, England
[23] UCL, Ctr Med Image Comp, Dept Comp Sci, London, England
[24] Kings Coll London, Sch Biomed Engn & Imaging Sci, London, England
[25] CONICET Univ Buenos Aires, Inst Calculo, Buenos Aires, Argentina
[26] UCL, Ctr Med Image Comp, London, England
[27] McGill Univ, Ctr Intelligent Machines, Montreal, PQ, Canada
[28] McGill Univ, MILA Quebec Artificial Intelligence Inst, Montreal, PQ, Canada
[29] Indiana Univ Sch Med, Dept Pathol & Lab Med, Div Computat Pathol, IU Hlth Informat & Translat Sci Bldg, Indianapolis, IN USA
[30] Univ Penn, Ctr Biomed Image Comp & Analyt CBICA, Philadelphia, PA USA
[31] Holon Inst Technol, Dept Digital Med Technol, Holon, Israel
[32] European Federat Med Informat, Le Mont Sur Lausanne, Switzerland
[33] Katholieke Univ Leuven, Ctr Proc Speech & Images, Leuven, Belgium
[34] IT Univ Copenhagen, Dept Comp Sci, Copenhagen, Denmark
[35] Broad Inst MIT & Harvard, Imaging Platform, Cambridge, MA USA
[36] Univ Oxford, Nuffield Orthopaed Ctr, Ctr Stat Med, Oxford, England
[37] NCI, Ctr Biomed Informat & Informat Technol, Bethesda, MD USA
[38] UBA, Inst Invest Ciencias Computac ICC, CONICET, Ciudad Autonoma De Buenos, Buenos Aires, Argentina
[39] Univ Pompeu Fabra, BCN Medtech, Barcelona, Spain
[40] Univ Adelaide, Australian Inst Machine Learning AIML, Adelaide, SA, Australia
[41] Fraunhofer MEVIS, Bremen, Germany
[42] Radboud Univ Nijmegen Med Ctr, Radboud Inst Hlth Sci, Nijmegen, Netherlands
[43] Tech Univ TU Dresden, DFG Cluster Excellence Phys Life, Dresden, Germany
[44] Ctr Syst Biol, Dresden, Germany
[45] Perelman Sch Med, Dept Surg, Philadelphia, PA USA
[46] Univ Penn, Sch Engn & Appl Sci, Gen Robot Automat Sensing & Percept Lab, Philadelphia, PA USA
[47] Univ Hlth Network, Princess Margaret Canc Ctr, Toronto, ON, Canada
[48] Univ Toronto, Dept Med Biophys, Toronto, ON, Canada
[49] Vector Inst Artificial Intelligence, Toronto, ON, Canada
[50] Univ Toronto, Dept Comp Sci, Toronto, ON, Canada
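Funding
Natural Sciences and Engineering Research Council of Canada; European Research Council; Swiss National Science Foundation; Dutch Research Council; US National Institutes of Health; Academy of Finland
Keywords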
HEALTH; SEGMENTATION; CRITERIA;
DOI
10.1038/s41592-023-02151-z
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. In biomedical image analysis, chosen performance metrics often do not reflect the domain interest, and thus fail to adequately measure scientific progress and hinder translation of ML techniques into practice. To overcome this, we created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics. Developed by a large international consortium in a multistage Delphi process, it is based on the novel concept of a problem fingerprint: a structured representation of the given problem that captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), dataset and algorithm output. On the basis of the problem fingerprint, users are guided through the process of choosing and applying appropriate validation metrics while being made aware of potential pitfalls. Metrics Reloaded targets image analysis problems that can be interpreted as classification tasks at image, object or pixel level, namely image-level classification, object detection, semantic segmentation and instance segmentation tasks. To improve the user experience, we implemented the framework in the Metrics Reloaded online tool. Following the convergence of ML methodology across application domains, Metrics Reloaded fosters the convergence of validation methodology. Its applicability is demonstrated for various biomedical use cases. Metrics Reloaded is a comprehensive framework for guiding researchers in the problem-aware selection of metrics for common tasks in biomedical image analysis.
Pages: 195 - 212
Page count: 30