An extensible big data software architecture managing a research resource of real-world clinical radiology data linked to other health data from the whole Scottish population

Cited by: 14
Authors
Nind, Thomas [1]
Sutherland, James [1]
McAllister, Gordon [1]
Hardy, Douglas [1]
Hume, Ally [2]
MacLeod, Ruairidh [2]
Caldwell, Jacqueline [3]
Krueger, Susan [1]
Tramma, Leandro [1]
Teviotdale, Ross [1]
Abdelatif, Mohammed [1]
Gillen, Kenny [1]
Ward, Joe [1]
Scobbie, Donald [2]
Baillie, Ian [3]
Brooks, Andrew [2]
Prodan, Bianca [2]
Kerr, William [2]
Sloan-Murphy, Dominic [2]
Herrera, Juan F. R. [2]
McManus, Dan [2]
Morris, Carole [3]
Sinclair, Carol [4]
Baxter, Rob [2]
Parsons, Mark [2]
Morris, Andrew [5]
Jefferson, Emily [1]
Affiliations
[1] Univ Dundee, Sch Med, Hlth Informat Ctr HIC, Second Floor,Level 7,Mailbox 15,Ninewells Hosp &, Dundee DD1 9SY2, Scotland
[2] Univ Edinburgh, Edinburgh Parallel Comp Ctr EPCC, Bayes Ctr, 47 Potterrow, Edinburgh EH8 9BT, Midlothian, Scotland
[3] Elect Data Res & Innovat Serv eDRIS, Publ Hlth Scotland PHS, Nine Edinburgh Bioquarter, Little France Rd, Edinburgh EH16 4UX, Midlothian, Scotland
[4] Publ Hlth Scotland PHS, Data Driven Innovat, Gyle Sq,1 South Gyle Crescent, Edinburgh EH12 9EB, Midlothian, Scotland
[5] Hlth Data Res HDR UK, Gibbs Bldg,215 Euston Rd, London NW1 2BE, England
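Funding
Wellcome Trust (UK); Medical Research Council (UK); Engineering and Physical Sciences Research Council (UK); Economic and Social Research Council (UK)
Keywords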
Radiology; Big Data; AI; ML; RADIOMICS; FACE;
DOI
10.1093/gigascience/giaa095
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Aim: To enable a world-leading research dataset of routinely collected clinical images linked to other routinely collected data from the whole Scottish national population. This includes more than 30 million different radiological examinations from a population of 5.4 million and >2 PB of data collected since 2010. Methods: Scotland has a central archive of radiological data used to directly provide clinical care to patients. We have developed an architecture and platform to securely extract a copy of those data, link it to other clinical or social datasets, remove personal data to protect privacy, and make the resulting data available to researchers in a controlled Safe Haven environment. Results: An extensive software platform has been developed to host, extract, and link data from cohorts to answer research questions. The platform has been tested on 5 different test cases and is currently being further enhanced to support 3 exemplar research projects. Conclusions: The data available are from a range of radiological modalities and scanner types and were collected under different environmental conditions. These real-world, heterogeneous data are valuable for training algorithms to support clinical decision making, especially for deep learning where large data volumes are required. The resource is now available for international research access. The platform and data can support new health research using artificial intelligence and machine learning technologies, as well as enabling discovery science.
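Illustrative sketch (not from the paper): the Methods sentence describes extracting a cohort, linking it to another dataset, and removing personal data before release. The Python below is a minimal illustration of that sequence under stated assumptions. The field names (patient_id, patient_name, date_of_birth), the function names, and the use of a keyed HMAC for pseudonyms are all hypothetical choices made for this example; the abstract does not say how the platform implements any of these steps.

```python
import hashlib
import hmac

# Hypothetical list of direct identifiers to strip before release;
# the abstract does not specify which fields the real platform removes.
DIRECT_IDENTIFIERS = {"patient_id", "patient_name", "date_of_birth"}


def pseudonymise(patient_id: str, project_key: bytes) -> str:
    """Map an identifier to a stable, project-specific pseudonym (keyed HMAC-SHA256)."""
    digest = hmac.new(project_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def build_research_extract(imaging_rows, linked_rows, cohort_ids, project_key):
    """Extract cohort imaging records, link them to a second dataset, strip identifiers."""
    linked_by_id = {row["patient_id"]: row for row in linked_rows}
    extract = []
    for row in imaging_rows:
        pid = row["patient_id"]
        if pid not in cohort_ids:
            continue  # extract only the requested cohort
        # Link on the shared identifier; imaging fields win on a name clash.
        merged = {**linked_by_id.get(pid, {}), **row}
        released = {k: v for k, v in merged.items() if k not in DIRECT_IDENTIFIERS}
        released["pseudo_id"] = pseudonymise(pid, project_key)
        extract.append(released)
    return extract


if __name__ == "__main__":
    # Toy records invented for the example.
    imaging = [
        {"patient_id": "A1", "patient_name": "Jane Doe", "modality": "CT"},
        {"patient_id": "B2", "patient_name": "John Roe", "modality": "MR"},
    ]
    prescribing = [{"patient_id": "A1", "date_of_birth": "1970-01-01", "on_statin": True}]
    for record in build_research_extract(imaging, prescribing, {"A1"}, b"demo-key"):
        print(record)
```

Using a different key per project means the same patient receives unrelated pseudonyms in different extracts, which is one common way to limit re-linkage across releases. Whether the platform described here does this is not stated in the abstract.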
Pages: 13
Related papers
32 in total
[1]  
Abbott D., 2008, What is Digital Curation? DCC Briefing Papers: Introduction to Curation
[2]   Images data practices for Semantic Segmentation of Breast Cancer using Deep Neural Network [J].
Ahmed, Luqman ;
Iqbal, Muhammad Munwar ;
Aldabbas, Hamza ;
Khalid, Shehzad ;
Saleem, Yasir ;
Saeed, Saqib .
JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2020, 14 (11) :15227-15243
[3]  
Amazon, WHAT ARE MICROSERVIC
[4]  
[Anonymous], **DATA OBJECT**, DOI 10.5524/100780
[5]   Implementation of an anonymisation tool for clinical trials using a clinical trial processor integrated with an existing trial patient data information system [J].
Aryanto, Kadek Y. E. ;
Broekema, Andre ;
Oudkerk, Matthijs ;
van Ooijen, Peter M. A. .
EUROPEAN RADIOLOGY, 2012, 22 (01) :144-151
[6]   Data Safe Havens in health research and healthcare [J].
Burton, Paul R. ;
Murtagh, Madeleine J. ;
Boyd, Andy ;
Williams, James B. ;
Dove, Edward S. ;
Wallace, Susan E. ;
Tasse, Anne-Marie ;
Little, Julian ;
Chisholm, Rex L. ;
Gaye, Amadou ;
Hveem, Kristian ;
Brookes, Anthony J. ;
Goodwin, Pat ;
Fistein, Jon ;
Bobrow, Martin ;
Knoppers, Bartha M. .
BIOINFORMATICS, 2015, 31 (20) :3241-3248
[7]   Predicting survival time of lung cancer patients using radiomic analysis [J].
Chaddad, Ahmad ;
Desrosiers, Christian ;
Toews, Matthew ;
Abdulkarim, Bassam .
ONCOTARGET, 2017, 8 (61) :104393-104407
[8]   Machine learning applications in prostate cancer magnetic resonance imaging [J].
Cuocolo, Renato ;
Cipullo, Maria Brunella ;
Stanzione, Arnaldo ;
Ugga, Lorenzo ;
Romeo, Valeria ;
Radice, Leonardo ;
Brunetti, Arturo ;
Imbriaco, Massimo .
EUROPEAN RADIOLOGY EXPERIMENTAL, 2019, 3 (01)
[9]   Radiomics: Images Are More than Pictures, They Are Data [J].
Gillies, Robert J. ;
Kinahan, Paul E. ;
Hricak, Hedvig .
RADIOLOGY, 2016, 278 (02) :563-577
[10]   A deep learning methodology for improved breast cancer diagnosis using multiparametric MRI [J].
Hu, Qiyuan ;
Whitney, Heather M. ;
Giger, Maryellen L. .
SCIENTIFIC REPORTS, 2020, 10 (01)