Distributed learning on 20 000+ lung cancer patients - The Personal Health Train

Cited by: 89
Authors
Deist, Timo M. [1, 2]
Dankers, Frank J. W. M. [1, 3]
Ojha, Priyanka [4 ]
Marshall, M. Scott [4 ]
Janssen, Tomas [4 ]
Faivre-Finn, Corinne [5 ]
Masciocchi, Carlotta [7 ]
Valentini, Vincenzo [6, 7]
Wang, Jiazhou [8 ]
Chen, Jiayan [8 ]
Zhang, Zhen [8 ]
Spezi, Emiliano [9, 10]
Button, Mick [10 ]
Nuyttens, Joost Jan [1, 11]
Vernhout, Rene [11 ]
van Soest, Johan
Jochems, Arthur [2 ]
Monshouwer, Rene [3 ]
Bussink, Johan [3 ]
Price, Gareth [5 ]
Lambin, Philippe [2 ]
Dekker, Andre [1 ]
Affiliations
[1] Maastricht Univ Med Ctr, GROW Sch Oncol & Dev Biol, Dept Radiat Oncol MAASTRO, Maastricht, Netherlands
[2] Maastricht Univ Med Ctr, GROW Sch Oncol & Dev Biol, D Lab Dept Precis Med, Maastricht, Netherlands
[3] Radboud Univ Nijmegen, Med Ctr, Dept Radiat Oncol, Nijmegen, Netherlands
[4] Netherlands Canc Inst Antoni van Leeuwenhoek, Dept Radiat Oncol, Amsterdam, Netherlands
[5] Univ Manchester, Manchester Acad Hlth Sci Ctr, Christie NHS Fdn Trust, Manchester, Lancs, England
[6] Univ Cattolica Sacro Cuore, Milan, Italy
[7] Fdn Policlin Univ A Gemelli IRCCS, Rome, Italy
[8] Fudan Univ, Shanghai Canc Ctr, Dept Radiat Oncol, Dept Oncol,Shanghai Med Coll, Shanghai, Peoples R China
[9] Cardiff Univ, Sch Engn, Cardiff, Wales
[10] Velindre Canc Ctr, Cardiff, Wales
[11] Erasmus MC, Canc Inst, Dept Radiat Oncol, Rotterdam, Netherlands
Funding
EU Horizon 2020
Keywords
Lung cancer; Big data; Distributed learning; Federated learning; Machine learning; Survival analysis; Prediction modeling; FAIR data; CARE
DOI
10.1016/j.radonc.2019.11.019
Chinese Library Classification
R73 [Oncology]
Discipline classification code
100214
Abstract
Background and purpose: Access to healthcare data is indispensable for scientific progress and innovation. Sharing healthcare data is time-consuming and notoriously difficult due to privacy and regulatory concerns. The Personal Health Train (PHT) provides a privacy-by-design infrastructure connecting FAIR (Findable, Accessible, Interoperable, Reusable) data sources and allows distributed data analysis and machine learning. Patient data never leaves a healthcare institute.
Materials and methods: Lung cancer patient-specific databases (tumor staging and post-treatment survival information) of oncology departments were translated according to a FAIR data model and stored locally in a graph database. Software was installed locally to enable deployment of distributed machine learning algorithms via a central server. Algorithms (MATLAB, code and documentation publicly available) are patient privacy-preserving as only summary statistics and regression coefficients are exchanged with the central server. A logistic regression model to predict post-treatment two-year survival was trained and evaluated by receiver operating characteristic curves (ROC), root mean square prediction error (RMSE) and calibration plots.
Results: In 4 months, we connected databases with 23 203 patient cases across 8 healthcare institutes in 5 countries (Amsterdam, Cardiff, Maastricht, Manchester, Nijmegen, Rome, Rotterdam, Shanghai) using the PHT. Summary statistics were computed across databases. A distributed logistic regression model predicting post-treatment two-year survival was trained on 14 810 patients treated between 1978 and 2011 and validated on 8 393 patients treated between 2012 and 2015.
Conclusion: The PHT infrastructure demonstrably overcomes patient privacy barriers to healthcare data sharing and enables fast data analyses across multiple institutes from different countries with different regulatory regimens. This infrastructure promotes global evidence-based medicine while prioritizing patient privacy.
(C) 2019 The Authors. Published by Elsevier B.V.
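The abstract states that only summary statistics and regression coefficients leave each institute. The following is a minimal Python sketch of that idea, not the authors' MATLAB implementation: it assumes a plain gradient-aggregation scheme, which may differ from the optimisation method actually used in the paper. All function names (`local_gradient`, `train_federated`, `make_site`), the learning rate, the number of rounds and the synthetic data are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def local_gradient(weights, rows, labels):
    """Runs inside one institute: returns only a summed gradient and a row count."""
    grad = [0.0] * len(weights)
    for x, y in zip(rows, labels):
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        for j, xi in enumerate(x):
            grad[j] += (p - y) * xi
    return grad, len(rows)


def train_federated(sites, n_features, lr=0.5, n_rounds=200):
    """Runs on the central server: sees gradients and counts, never patient rows."""
    weights = [0.0] * n_features
    for _ in range(n_rounds):
        total_grad = [0.0] * n_features
        total_n = 0
        for rows, labels in sites:
            grad, n = local_gradient(weights, rows, labels)
            total_grad = [a + b for a, b in zip(total_grad, grad)]
            total_n += n
        # Average gradient step on the shared coefficients.
        weights = [w - lr * g / total_n for w, g in zip(weights, total_grad)]
    return weights


def make_site(n, true_weights, rng):
    """Synthetic stand-in for one local database (intercept plus one feature)."""
    rows, labels = [], []
    for _ in range(n):
        x = [1.0, rng.gauss(0.0, 1.0)]
        p = sigmoid(sum(w * xi for w, xi in zip(true_weights, x)))
        rows.append(x)
        labels.append(1 if rng.random() < p else 0)
    return rows, labels


rng = random.Random(0)
true_weights = [-0.5, 1.5]  # hypothetical coefficients used to simulate outcomes
sites = [make_site(n, true_weights, rng) for n in (300, 500, 200)]
federated = train_federated(sites, n_features=2)

# Pooling all rows into a single "site" yields the same summed gradient each round,
# so the coefficients should agree up to floating-point rounding.
pooled = (
    [x for rows, _ in sites for x in rows],
    [y for _, labels in sites for y in labels],
)
centralized = train_federated([pooled], n_features=2)
```

Because the logistic-regression gradient is a sum over patients, adding per-site gradients gives the same update as training on pooled data, which is why this kind of scheme can match a centralized fit without moving patient-level records.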
Pages: 189-200
Number of pages: 12