Cluster analysis and visualisation of electronic health records data to identify undiagnosed patients with rare genetic diseases

Cited by: 3
Authors
Moynihan, Daniel [1]
Monaco, Sean [2]
Ting, Teck Wah [3, 4]
Narasimhalu, Kaavya [4, 5]
Hsieh, Jenny [4, 6]
Kam, Sylvia [3, 4]
Lim, Jiin Ying [3, 4]
Lim, Weng Khong [4, 7, 8, 9]
Davila, Sonia [4, 7]
Bylstra, Yasmin [4, 7]
Balakrishnan, Iswaree Devi [4, 10]
Heng, Mark [11]
Chia, Elian [11]
Yeo, Khung Keong [10]
Goh, Bee Keow [12]
Gupta, Ritu [1]
Tan, Tele [1]
Baynam, Gareth [13, 14]
Jamuar, Saumya Shekhar [3, 4, 7]
Affiliations
[1] Curtin Univ, Perth, Australia
[2] Hlth Catalyst, South Jordan, UT USA
[3] KK Womens & Childrens Hosp, Dept Paediat, Genet Serv, 100 Bukit Timah Rd, Singapore 229899, Singapore
[4] SingHealth Duke NUS Genom Med Ctr, Singapore, Singapore
[5] Singapore Gen Hosp, Natl Neurosci Inst, Dept Neurol, Singapore, Singapore
[6] Singapore Gen Hosp, Dept Internal Med, Singapore, Singapore
[7] SingHealth Duke NUS Inst Precis Med, Singapore, Singapore
[8] Duke NUS Med Sch, Canc & Stem Cell Biol Program, Singapore, Singapore
[9] Genome Inst Singapore, Lab Genome Variat Analyt, Singapore, Singapore
[10] Natl Heart Ctr Singapore, Singapore, Singapore
[11] SingHealth Off Insights & Analyt, Singapore, Singapore
[12] KK Womens & Childrens Hosp, Data Analyt Off, Singapore, Singapore
[13] Perth Childrens Hosp, Rare Care Ctr, Perth, WA, Australia
[14] Western Australian Register Dev Anomalies, Perth, WA, Australia
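Keywords
FABRY-DISEASE
DOI
10.1038/s41598-024-55424-8
Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract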
Rare genetic diseases affect 5-8% of the population but are often undiagnosed or misdiagnosed. Electronic health records (EHR) contain large amounts of data, which provide opportunities for analysing and mining. Data mining, in the form of cluster analysis and visualisation, was performed on a database containing deidentified health records of 1.28 million patients across 3 major hospitals in Singapore, in a bid to improve the diagnostic process for patients who are living with an undiagnosed rare disease, specifically focusing on Fabry Disease and Familial Hypercholesterolaemia (FH). On a baseline of 4 patients, we identified 2 additional patients with potential diagnosis of Fabry disease, suggesting a potential 50% increase in diagnosis. Similarly, we identified > 12,000 individuals who fulfil the clinical and laboratory criteria for FH but had not been diagnosed previously. This proof-of-concept study showed that it is possible to perform mining on EHR data albeit with some challenges and limitations.
Pages: 9
Related papers
50 in total
  • [31] External validation of an algorithm to identify patients with high data-completeness in electronic health Records for Comparative Effectiveness Research
    Lin, Joshua K.
    Jin, Yinzhu
    Glynn, Robert J.
    Schneeweiss, Sebastian
    PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2019, 28 : 210 - 211
  • [32] Availability of Social Determinants of Health Data for Ophthalmology Patients in Electronic Health Records
    Nayak, Mahasweta
    Lee, Terrence Cheng-Yuan
    Saseendrakumar, Bharanidharan Radha
    Chan, Alison X.
    McDermott, John J.
    Shahrvini, Bita
    Ye, Gordon Y.
    Sitapati, Amy M.
    Nebeker, Camille
    Baxter, Sally
    INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 2022, 63 (07)
  • [33] LEVERAGING ELECTRONIC HEALTH RECORDS AND POLYGENIC RISK SCORES TO IDENTIFY INDIVIDUALS AT HIGH RISK FOR PSYCHIATRIC DISEASES
    Zheutlin, Amanda
    Walsh, Colin
    Ruderfer, Douglas
    Smoller, Jordan
    Choi, Karmel
    EUROPEAN NEUROPSYCHOPHARMACOLOGY, 2019, 29 : S54 - S55
  • [34] Genetic data and electronic health records: a discussion of ethical, logistical and technological considerations
    Shoenbill, Kimberly
    Fost, Norman
    Tachinardi, Umberto
    Mendonca, Eneida A.
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2014, 21 (01) : 171 - 180
  • [35] Early Detection of Diseases using Electronic Health Records Data and Covariance-Regularized Linear Discriminant Analysis
    Bian, Jiang
    Barnes, Laura E.
    Chen, Guanling
    Xiong, Haoyi
    2017 IEEE EMBS INTERNATIONAL CONFERENCE ON BIOMEDICAL & HEALTH INFORMATICS (BHI), 2017, : 457 - 460
  • [36] Characterizing and Managing Missing Structured Data in Electronic Health Records: Data Analysis
    Beaulieu-Jones, Brett K.
    Lavage, Daniel R.
    Snyder, John W.
    Moore, Jason H.
    Pendergrass, Sarah A.
    Bauer, Christopher R.
    JMIR MEDICAL INFORMATICS, 2018, 6 (01)
  • [37] Analysis of Electronic Health Records to Identify the Patient's Treatment Lines: Challenges and Opportunities
    Najafabadipour, Marjan
    Manuel Tunas, Juan
    Rodriguez-Gonzalez, Alejandro
    Menasalvas, Ernestina
    ARTIFICIAL INTELLIGENCE XXXVI, 2019, 11927 : 437 - 442
  • [38] Shared Genetic Etiology of Autoimmune Diseases in Patients from a Biorepository Linked to De-identified Electronic Health Records
    Restrepo, Nicole A.
    Butkiewicz, Mariusz
    McGrath, Josephine A.
    Crawford, Dana C.
    FRONTIERS IN GENETICS, 2016, 7
  • [39] Characteristics of patients hospitalized for falls: electronic health records analysis
    Schuster, Anna Kathrin
    Kesselmeier, Miriam
    Weisbach, Laura
    Stumme, Christoph
    Behringer, Wilhelm
    Hartmann, Michael
    Farker, Katrin
    JOURNAL OF PUBLIC HEALTH-HEIDELBERG, 2024,
  • [40] Imputation of Missing Data in Electronic Health Records Based on Patients' Similarities
    Jazayeri, Ali
    Liang, Ou Stella
    Yang, Christopher C.
    JOURNAL OF HEALTHCARE INFORMATICS RESEARCH, 2020, 4 (03) : 295 - 307