Metrics reloaded: recommendations for image analysis validation

Cited by: 86
Authors
Maier-Hein, Lena [1, 2, 3, 4, 5]
Reinke, Annika [1, 2, 3]
Godau, Patrick [1, 3, 5]
Tizabi, Minu D. [1, 5]
Buettner, Florian [6, 7, 8, 9, 10]
Christodoulou, Evangelia [1]
Glocker, Ben [11]
Isensee, Fabian [12, 13]
Kleesiek, Jens [14]
Kozubek, Michal [15, 16]
Reyes, Mauricio [17, 18]
Riegler, Michael A. [19, 20]
Wiesenfarth, Manuel [21]
Kavur, A. Emre [1, 12, 13]
Sudre, Carole H. [22, 23, 24]
Baumgartner, Michael [12]
Eisenmann, Matthias [1]
Heckmann-Noetzel, Doreen [1, 5]
Raedsch, Tim [1, 2]
Acion, Laura [25]
Antonelli, Michela [24, 26]
Arbel, Tal [27, 28]
Bakas, Spyridon [29, 30]
Benis, Arriel [31, 32]
Blaschko, Matthew B. [33]
Cardoso, M. Jorge [24]
Cheplygina, Veronika [34]
Cimini, Beth A. [35]
Collins, Gary S. [36]
Farahani, Keyvan [37]
Ferrer, Luciana [38]
Galdran, Adrian [39, 40]
van Ginneken, Bram [41, 42]
Haase, Robert [43, 44, 94]
Hashimoto, Daniel A. [45, 46]
Hoffman, Michael M. [47, 48, 49, 50]
Huisman, Merel [51]
Jannin, Pierre [52, 53]
Kahn, Charles E. [54, 55]
Kainmueller, Dagmar [56, 57]
Kainz, Bernhard [58, 59]
Karargyris, Alexandros [60]
Karthikesalingam, Alan [61]
Kofler, Florian [62]
Kopp-Schneider, Annette [21]
Kreshuk, Anna [63]
Kurc, Tahsin [64]
Landman, Bennett A. [65]
Litjens, Geert [66]
Madani, Amin [67]
Affiliations
[1] German Canc Res Ctr DKFZ Heidelberg, Div Intelligent Med Syst, Heidelberg, Germany
[2] German Canc Res Ctr DKFZ Heidelberg, HI Helmholtz Imaging, Heidelberg, Germany
[3] Heidelberg Univ, Fac Math & Comp Sci, Heidelberg, Germany
[4] Heidelberg Univ, Med Fac, Heidelberg, Germany
[5] NCT Heidelberg, DKFZ & Univ Med Ctr Heidelberg, Natl Ctr Tumor Dis NCT, Heidelberg, Germany
[6] German Canc Consortium DKTK, Partner Site Frankfurt Mainz, DKFZ & UCT Frankfurt Marburg, Frankfurt, Germany
[7] German Canc Res Ctr DKFZ Heidelberg, Heidelberg, Germany
[8] Goethe Univ Frankfurt, Dept Med, Frankfurt, Germany
[9] Goethe Univ Frankfurt, Dept Informat, Frankfurt, Germany
[10] Frankfurt Canc Inst, Frankfurt, Germany
[11] Imperial Coll London, Dept Comp, South Kensington Campus, London, England
[12] German Canc Res Ctr DKFZ Heidelberg, Div Med Image Comp, Heidelberg, Germany
[13] German Canc Res Ctr DKFZ Heidelberg, HI Appl Comp Vis Lab, Heidelberg, Germany
[14] Univ Med Essen, Inst AI Transfus Med, Essen, Germany
[15] Masaryk Univ, Ctr Biomed Image Anal, Brno, Czech Republic
[16] Masaryk Univ, Fac Informat, Brno, Czech Republic
[17] Univ Bern, ARTORG Ctr Biomed Engn Res, Bern, Switzerland
[18] Univ Bern, Dept Radiat Oncol, Univ Hosp Bern, Bern, Switzerland
[19] Simula Metropolitan Ctr Digital Engn, Oslo, Norway
[20] UiT Arctic Univ Norway, Dept Comp Sci, Tromso, Norway
[21] German Canc Res Ctr DKFZ Heidelberg, Div Biostat, Heidelberg, Germany
[22] UCL, MRC Unit Lifelong Hlth & Ageing UCL, Dept Comp Sci, London, England
[23] UCL, Ctr Med Image Comp, Dept Comp Sci, London, England
[24] Kings Coll London, Sch Biomed Engn & Imaging Sci, London, England
[25] CONICET Univ Buenos Aires, Inst Calculo, Buenos Aires, Argentina
[26] UCL, Ctr Med Image Comp, London, England
[27] McGill Univ, Ctr Intelligent Machines, Montreal, PQ, Canada
[28] McGill Univ, MILA Quebec Artificial Intelligence Inst, Montreal, PQ, Canada
[29] Indiana Univ Sch Med, Dept Pathol & Lab Med, Div Computat Pathol, IU Hlth Informat & Translat Sci Bldg, Indianapolis, IN USA
[30] Univ Penn, Ctr Biomed Image Comp & Analyt CBICA, Philadelphia, PA USA
[31] Holon Inst Technol, Dept Digital Med Technol, Holon, Israel
[32] European Federat Med Informat, Le Mont Sur Lausanne, Switzerland
[33] Katholieke Univ Leuven, Ctr Proc Speech & Images, Leuven, Belgium
[34] IT Univ Copenhagen, Dept Comp Sci, Copenhagen, Denmark
[35] Broad Inst MIT & Harvard, Imaging Platform, Cambridge, MA USA
[36] Univ Oxford, Nuffield Orthopaed Ctr, Ctr Stat Med, Oxford, England
[37] NCI, Ctr Biomed Informat & Informat Technol, Bethesda, MD USA
[38] UBA, Inst Invest Ciencias Computac ICC, CONICET, Ciudad Autonoma De Buenos, Buenos Aires, Argentina
[39] Univ Pompeu Fabra, BCN Medtech, Barcelona, Spain
[40] Univ Adelaide, Australian Inst Machine Learning AIML, Adelaide, SA, Australia
[41] Fraunhofer MEVIS, Bremen, Germany
[42] Radboud Univ Nijmegen Med Ctr, Radboud Inst Hlth Sci, Nijmegen, Netherlands
[43] Tech Univ TU Dresden, DFG Cluster Excellence Phys Life, Dresden, Germany
[44] Ctr Syst Biol, Dresden, Germany
[45] Perelman Sch Med, Dept Surg, Philadelphia, PA USA
[46] Univ Penn, Sch Engn & Appl Sci, Gen Robot Automat Sensing & Percept Lab, Philadelphia, PA USA
[47] Univ Hlth Network, Princess Margaret Canc Ctr, Toronto, ON, Canada
[48] Univ Toronto, Dept Med Biophys, Toronto, ON, Canada
[49] Vector Inst Artificial Intelligence, Toronto, ON, Canada
[50] Univ Toronto, Dept Comp Sci, Toronto, ON, Canada
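Funding
Natural Sciences and Engineering Research Council of Canada; European Research Council; Swiss National Science Foundation; Dutch Research Council; US National Institutes of Health; Academy of Finland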
Keywords
HEALTH; SEGMENTATION; CRITERIA;
DOI
10.1038/s41592-023-02151-z
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010 ; 081704 ;
Abstract
Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. In biomedical image analysis, chosen performance metrics often do not reflect the domain interest, and thus fail to adequately measure scientific progress and hinder translation of ML techniques into practice. To overcome this, we created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics. Developed by a large international consortium in a multistage Delphi process, it is based on the novel concept of a problem fingerprint: a structured representation of the given problem that captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), dataset and algorithm output. On the basis of the problem fingerprint, users are guided through the process of choosing and applying appropriate validation metrics while being made aware of potential pitfalls. Metrics Reloaded targets image analysis problems that can be interpreted as classification tasks at image, object or pixel level, namely image-level classification, object detection, semantic segmentation and instance segmentation tasks. To improve the user experience, we implemented the framework in the Metrics Reloaded online tool. Following the convergence of ML methodology across application domains, Metrics Reloaded fosters the convergence of validation methodology. Its applicability is demonstrated for various biomedical use cases. Metrics Reloaded is a comprehensive framework for guiding researchers in the problem-aware selection of metrics for common tasks in biomedical image analysis.
Pages: 195-212
Page count: 30