Metrics reloaded: recommendations for image analysis validation

Cited by: 86
Authors
Maier-Hein, Lena [1, 2, 3, 4, 5]
Reinke, Annika [1, 2, 3]
Godau, Patrick [1, 3, 5]
Tizabi, Minu D. [1, 5]
Buettner, Florian [6, 7, 8, 9, 10]
Christodoulou, Evangelia [1]
Glocker, Ben [11]
Isensee, Fabian [12, 13]
Kleesiek, Jens [14]
Kozubek, Michal [15, 16]
Reyes, Mauricio [17, 18]
Riegler, Michael A. [19, 20]
Wiesenfarth, Manuel [21]
Kavur, A. Emre [1, 12, 13]
Sudre, Carole H. [22, 23, 24]
Baumgartner, Michael [12]
Eisenmann, Matthias [1]
Heckmann-Noetzel, Doreen [1, 5]
Raedsch, Tim [1, 2]
Acion, Laura [25]
Antonelli, Michela [24, 26]
Arbel, Tal [27, 28]
Bakas, Spyridon [29, 30]
Benis, Arriel [31, 32]
Blaschko, Matthew B. [33]
Cardoso, M. Jorge [24]
Cheplygina, Veronika [34]
Cimini, Beth A. [35]
Collins, Gary S. [36]
Farahani, Keyvan [37]
Ferrer, Luciana [38]
Galdran, Adrian [39, 40]
van Ginneken, Bram [41, 42]
Haase, Robert [43, 44, 94]
Hashimoto, Daniel A. [45, 46]
Hoffman, Michael M. [47, 48, 49, 50]
Huisman, Merel [51]
Jannin, Pierre [52, 53]
Kahn, Charles E. [54, 55]
Kainmueller, Dagmar [56, 57]
Kainz, Bernhard [58, 59]
Karargyris, Alexandros [60]
Karthikesalingam, Alan [61]
Kofler, Florian [62]
Kopp-Schneider, Annette [21]
Kreshuk, Anna [63]
Kurc, Tahsin [64]
Landman, Bennett A. [65]
Litjens, Geert [66]
Madani, Amin [67]
Affiliations
[1] German Canc Res Ctr DKFZ Heidelberg, Div Intelligent Med Syst, Heidelberg, Germany
[2] German Canc Res Ctr DKFZ Heidelberg, HI Helmholtz Imaging, Heidelberg, Germany
[3] Heidelberg Univ, Fac Math & Comp Sci, Heidelberg, Germany
[4] Heidelberg Univ, Med Fac, Heidelberg, Germany
[5] NCT Heidelberg, DKFZ & Univ Med Ctr Heidelberg, Natl Ctr Tumor Dis NCT, Heidelberg, Germany
[6] German Canc Consortium DKTK, Partner Site Frankfurt Mainz, DKFZ & UCT Frankfurt Marburg, Frankfurt, Germany
[7] German Canc Res Ctr DKFZ Heidelberg, Heidelberg, Germany
[8] Goethe Univ Frankfurt, Dept Med, Frankfurt, Germany
[9] Goethe Univ Frankfurt, Dept Informat, Frankfurt, Germany
[10] Frankfurt Canc Inst, Frankfurt, Germany
[11] Imperial Coll London, Dept Comp, South Kensington Campus, London, England
[12] German Canc Res Ctr DKFZ Heidelberg, Div Med Image Comp, Heidelberg, Germany
[13] German Canc Res Ctr DKFZ Heidelberg, HI Appl Comp Vis Lab, Heidelberg, Germany
[14] Univ Med Essen, Inst AI Transfus Med, Essen, Germany
[15] Masaryk Univ, Ctr Biomed Image Anal, Brno, Czech Republic
[16] Masaryk Univ, Fac Informat, Brno, Czech Republic
[17] Univ Bern, ARTORG Ctr Biomed Engn Res, Bern, Switzerland
[18] Univ Bern, Dept Radiat Oncol, Univ Hosp Bern, Bern, Switzerland
[19] Simula Metropolitan Ctr Digital Engn, Oslo, Norway
[20] UiT Arctic Univ Norway, Dept Comp Sci, Tromso, Norway
[21] German Canc Res Ctr DKFZ Heidelberg, Div Biostat, Heidelberg, Germany
[22] UCL, MRC Unit Lifelong Hlth & Ageing UCL, Dept Comp Sci, London, England
[23] UCL, Ctr Med Image Comp, Dept Comp Sci, London, England
[24] Kings Coll London, Sch Biomed Engn & Imaging Sci, London, England
[25] CONICET Univ Buenos Aires, Inst Calculo, Buenos Aires, Argentina
[26] UCL, Ctr Med Image Comp, London, England
[27] McGill Univ, Ctr Intelligent Machines, Montreal, PQ, Canada
[28] McGill Univ, MILA Quebec Artificial Intelligence Inst, Montreal, PQ, Canada
[29] Indiana Univ Sch Med, Dept Pathol & Lab Med, Div Computat Pathol, IU Hlth Informat & Translat Sci Bldg, Indianapolis, IN USA
[30] Univ Penn, Ctr Biomed Image Comp & Analyt CBICA, Philadelphia, PA USA
[31] Holon Inst Technol, Dept Digital Med Technol, Holon, Israel
[32] European Federat Med Informat, Le Mont Sur Lausanne, Switzerland
[33] Katholieke Univ Leuven, Ctr Proc Speech & Images, Leuven, Belgium
[34] IT Univ Copenhagen, Dept Comp Sci, Copenhagen, Denmark
[35] Broad Inst MIT & Harvard, Imaging Platform, Cambridge, MA USA
[36] Univ Oxford, Nuffield Orthopaed Ctr, Ctr Stat Med, Oxford, England
[37] NCI, Ctr Biomed Informat & Informat Technol, Bethesda, MD USA
[38] UBA, Inst Invest Ciencias Computac ICC, CONICET, Ciudad Autonoma De Buenos, Buenos Aires, Argentina
[39] Univ Pompeu Fabra, BCN Medtech, Barcelona, Spain
[40] Univ Adelaide, Australian Inst Machine Learning AIML, Adelaide, SA, Australia
[41] Fraunhofer MEVIS, Bremen, Germany
[42] Radboud Univ Nijmegen Med Ctr, Radboud Inst Hlth Sci, Nijmegen, Netherlands
[43] Tech Univ TU Dresden, DFG Cluster Excellence Phys Life, Dresden, Germany
[44] Ctr Syst Biol, Dresden, Germany
[45] Perelman Sch Med, Dept Surg, Philadelphia, PA USA
[46] Univ Penn, Sch Engn & Appl Sci, Gen Robot Automat Sensing & Percept Lab, Philadelphia, PA USA
[47] Univ Hlth Network, Princess Margaret Canc Ctr, Toronto, ON, Canada
[48] Univ Toronto, Dept Med Biophys, Toronto, ON, Canada
[49] Vector Inst Artificial Intelligence, Toronto, ON, Canada
[50] Univ Toronto, Dept Comp Sci, Toronto, ON, Canada
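Funding
Natural Sciences and Engineering Research Council of Canada; European Research Council; Swiss National Science Foundation; Dutch Research Council; National Institutes of Health (USA); Academy of Finland;
Keywords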
HEALTH; SEGMENTATION; CRITERIA;
DOI
10.1038/s41592-023-02151-z
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. In biomedical image analysis, chosen performance metrics often do not reflect the domain interest, and thus fail to adequately measure scientific progress and hinder translation of ML techniques into practice. To overcome this, we created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics. Developed by a large international consortium in a multistage Delphi process, it is based on the novel concept of a problem fingerprint, a structured representation of the given problem that captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), dataset and algorithm output. On the basis of the problem fingerprint, users are guided through the process of choosing and applying appropriate validation metrics while being made aware of potential pitfalls. Metrics Reloaded targets image analysis problems that can be interpreted as classification tasks at image, object or pixel level, namely image-level classification, object detection, semantic segmentation and instance segmentation tasks. To improve the user experience, we implemented the framework in the Metrics Reloaded online tool. Following the convergence of ML methodology across application domains, Metrics Reloaded fosters the convergence of validation methodology. Its applicability is demonstrated for various biomedical use cases. Metrics Reloaded is a comprehensive framework for guiding researchers in the problem-aware selection of metrics for common tasks in biomedical image analysis.
Pages: 195-212
Number of pages: 30
Related papers
50 in total
  • [1] Metrics reloaded: recommendations for image analysis validation
    Lena Maier-Hein
    Annika Reinke
    Patrick Godau
    Minu D. Tizabi
    Florian Buettner
    Evangelia Christodoulou
    Ben Glocker
    Fabian Isensee
    Jens Kleesiek
    Michal Kozubek
    Mauricio Reyes
    Michael A. Riegler
    Manuel Wiesenfarth
    A. Emre Kavur
    Carole H. Sudre
    Michael Baumgartner
    Matthias Eisenmann
    Doreen Heckmann-Nötzel
    Tim Rädsch
    Laura Acion
    Michela Antonelli
    Tal Arbel
    Spyridon Bakas
    Arriel Benis
    Matthew B. Blaschko
    M. Jorge Cardoso
    Veronika Cheplygina
    Beth A. Cimini
    Gary S. Collins
    Keyvan Farahani
    Luciana Ferrer
    Adrian Galdran
    Bram van Ginneken
    Robert Haase
    Daniel A. Hashimoto
    Michael M. Hoffman
    Merel Huisman
    Pierre Jannin
    Charles E. Kahn
    Dagmar Kainmueller
    Bernhard Kainz
    Alexandros Karargyris
    Alan Karthikesalingam
    Florian Kofler
    Annette Kopp-Schneider
    Anna Kreshuk
    Tahsin Kurc
    Bennett A. Landman
    Geert Litjens
    Amin Madani
    Nature Methods, 2024, 21 : 195 - 212
  • [2] A Validation of Combined Metrics for Color Image Quality Assessment
    Okarma, Krzysztof
    COMPUTER VISION AND GRAPHICS, ICCVG 2014, 2014, 8671 : 1 - 8
  • [3] A validation of combined metrics for color image quality assessment
    Okarma, Krzysztof
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2014, 8671 : 1 - 8
  • [4] Performance Validation and Analysis for Multi-Method Fusion Based Image Quality Metrics in A New Image Database
    Ma, Xiaoyu
    Jiang, Xiuhua
    Pan, Da
    CHINA COMMUNICATIONS, 2019, 16 (08) : 147 - 161
  • [5] Performance Validation and Analysis for Multi-Method Fusion Based Image Quality Metrics in A New Image Database
    Xiaoyu Ma
    Xiuhua Jiang
    Da Pan
    中国通信, 2019, 16 (08) : 147 - 161
  • [6] Investigation of Image Processing Techniques in MRI Based Medical Image Analysis Methods and Validation Metrics for Brain Tumor
    Kalaiselvi, T.
    Selvi, S. Karthigai
    CURRENT MEDICAL IMAGING REVIEWS, 2018, 14 (04) : 489 - 505
  • [7] Recommendations for validation and verification of deformable image registration in radiotherapy
    Bosma, L.
    Hussein, M.
    Jameson, M.
    Asghar, S.
    Brock, K.
    McClelland, J.
    Poeta, S.
    Yuen, J.
    Zachiu, C.
    Yeo, A.
    RADIOTHERAPY AND ONCOLOGY, 2023, 182 : S505 - S506
  • [8] Validation of algorithmic CT image quality metrics with preferences of radiologists
    Cheng, Yuan
    Abadi, Ehsan
    Smith, Taylor Brunton
    Ria, Francesco
    Meyer, Mathias
    Marin, Daniele
    Samei, Ehsan
    MEDICAL PHYSICS, 2019, 46 (11) : 4837 - 4846
  • [9] Analysis and Evaluation of Image Quality Metrics
    Samajdar, Tina
    Quraishi, Md Iqbal
    INFORMATION SYSTEMS DESIGN AND INTELLIGENT APPLICATIONS, VOL 2, 2015, 340 : 369 - 378
  • [10] Validation Metrics Analysis of Community Detection Algorithms
    Li, Hui
    2016 2ND IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2016, : 2521 - 2525