Addressing challenges in radiomics research: systematic review and repository of open-access cancer imaging datasets

Cited by: 8
Authors
Woznicki, Piotr [1]
Laqua, Fabian Christopher [1]
Al-Haj, Adam [2]
Bley, Thorsten [1]
Baessler, Bettina [1]
Affiliations
[1] Univ Hosp Wurzburg, Dept Diagnost & Intervent Radiol, Wurzburg, Germany
[2] Med Univ Warsaw, Fac Med, Warsaw, Poland
Keywords
Radiomics; Radiology; Cancer imaging; Machine learning; Reproducibility of results; BIOMARKERS; TEXTURE; MODEL; HEAD;
DOI
10.1186/s13244-023-01556-w
Chinese Library Classification (CLC) number
R8 [Special medicine]; R445 [Diagnostic imaging];
Subject classification code
1002 ; 100207 ; 1009 ;
Abstract
Objectives: Open-access cancer imaging datasets have become integral for evaluating novel AI approaches in radiology. However, their use in quantitative analysis with radiomics features presents unique challenges, such as incomplete documentation, low visibility, non-uniform data formats, data inhomogeneity, and complex preprocessing. These issues may cause problems with reproducibility and standardization in radiomics studies.

Methods: We systematically reviewed imaging datasets with public copyright licenses, published up to March 2023 across four large online cancer imaging archives. We included only datasets with tomographic images (CT, MRI, or PET), segmentations, and clinical annotations, specifically identifying those suitable for radiomics research. Reproducible preprocessing and feature extraction were performed for each dataset to enable their easy reuse.

Results: We discovered 29 datasets with corresponding segmentations and labels in the form of health outcomes, tumor pathology, staging, imaging-based scores, genetic markers, or repeated imaging. We compiled a repository encompassing 10,354 patients and 49,515 scans. Of the 29 datasets, 15 were licensed under Creative Commons licenses, allowing both non-commercial and commercial usage and redistribution, while others featured custom or restricted licenses. Studies spanned from the early 1990s to 2021, with the majority concluding after 2013. Seven different formats were used for the imaging data. Preprocessing and feature extraction were successfully performed for each dataset.

Conclusion: RadiomicsHub is a comprehensive public repository with radiomics features derived from a systematic review of public cancer imaging datasets. By converting all datasets to a standardized format and ensuring reproducible and traceable processing, RadiomicsHub addresses key reproducibility and standardization challenges in radiomics.

Critical relevance statement: This study critically addresses the challenges associated with locating, preprocessing, and extracting quantitative features from open-access datasets, to facilitate more robust and reliable evaluations of radiomics models.

Key points:
- Through a systematic review, we identified 29 cancer imaging datasets suitable for radiomics research.
- A public repository with collection overview and radiomics features, encompassing 10,354 patients and 49,515 scans, was compiled.
- Most datasets can be shared, used, and built upon freely under a Creative Commons license.
- All 29 identified datasets have been converted into a common format to enable reproducible radiomics feature extraction.
Pages: 13
Related papers
  • [1] Addressing challenges in radiomics research: systematic review and repository of open-access cancer imaging datasets
    Piotr Woznicki
    Fabian Christopher Laqua
    Adam Al-Haj
    Thorsten Bley
    Bettina Baeßler
    Insights into Imaging, 14
  • [2] The Cancer Imaging Archive: Supporting Radiomic and Imaging Genomic Research with Open-Access Data Sets
    Kirby, J.
    Tarbox, L.
    Freymann, J.
    Jaffe, C.
    Prior, F.
    MEDICAL PHYSICS, 2015, 42 (06) : 3587 - 3587
  • [3] Paving the Way for Alzheimer's Disease Prevention: A Systematic Review of Global Open-Access Neuroimaging Datasets in Healthy Individuals
    Ly, Maria
    Yu, Gary Z.
    Chwa, Won Jong
    Raji, Cyrus A.
    JOURNAL OF ALZHEIMERS DISEASE, 2023, 96 (04) : 1441 - 1451
  • [4] ADDRESSING THE GLOBAL DEMAND FOR EDUCATIONAL MATERIAL WITH AN ONLINE OPEN-ACCESS, INTERNET-BASED LECTURE REPOSITORY IN PEDIATRIC ANESTHESIA
    Kynes, J. Matthew
    Hsu, Grace
    Evans, Faye M.
    ANESTHESIA AND ANALGESIA, 2018, 126 (04) : 286 - 287
  • [5] Open-Access Physical Activity Programs for Older Adults: A Pragmatic and Systematic Review
    Balis, Laura E.
    Strayer, Thomas, III
    Ramalingam, NithyaPriya
    Wilson, Meghan
    Harden, Samantha M.
    GERONTOLOGIST, 2019, 59 (04) : E268 - E278
  • [6] The Road Not Taken: Fostering Research on the Psychology of Religiosity and Spirituality via Underused Representative, Open-Access Datasets (ROADs)
    Scott, Matthew J.
    Johnson, Kathryn A.
    Okun, Morris A.
    Cohen, Adam B.
    INTERNATIONAL JOURNAL FOR THE PSYCHOLOGY OF RELIGION, 2019, 29 (03) : 204 - 221
  • [7] Digitizing extant bat diversity: An open-access repository of 3D μCT-scanned skulls for research and education
    Shi, Jeff J.
    Westeen, Erin P.
    Rabosky, Daniel L.
    PLOS ONE, 2018, 13 (09)
  • [8] Addressing the conceptualization and measurement challenges of sustainability orientation: A systematic review and research agenda
    Khizar, Hafiz Muhammad Usman
    Iqbal, Muhammad Jawad
    Khalid, Junaid
    Adomako, Samuel
    JOURNAL OF BUSINESS RESEARCH, 2022, 142 : 718 - 743
  • [9] Ethical Complexities and Concerns Surrounding Magnetic Resonance Imaging and the Open-Access Scientific Framework in Autism Research
    Sader, Michelle
    Maloney, Ellen
    Waiter, Gordon
    Kerr-Gaffney, Jess
    Tchanturia, Kate
    Gillespie-Smith, Karri
    Duffy, Fiona
    AUTISM IN ADULTHOOD, 2024
  • [10] Terahertz cancer imaging and sensing: open research challenges and opportunities
    Gezimati, Mavis
    Singh, Ghanshyam
    OPTICAL AND QUANTUM ELECTRONICS, 2023, 55 (08)